Similar articles
 Found 20 similar articles (search time: 421 ms)
1.
We develop three Bayesian predictive probability functions based on data in the form of a double sample. The first predicts the true unobservable count of interest in a future sample for a Poisson model with data subject to misclassification; the other two predict the number of misclassified counts in a current observable fallible count for an event of interest. We formulate a Gibbs sampler to calculate prediction intervals for these three unobservable random variables and apply our new predictive models to calculate prediction intervals for a real‐data example. (© 2004 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)

2.
We consider a Bayesian approach to the nonlinear regression model in which the normal distribution of the error term is replaced by skewed distributions that account for both skewness and heavy tails, or for skewness alone. The data considered in this paper are repeated measurements taken over time on a set of individuals. Such multiple observations on the same individual generally produce serially correlated outcomes, so our model also allows for correlation between observations made on the same individual. We illustrate the procedure using a data set on the growth curves of a clinical measurement for a group of pregnant women from an obstetrics clinic in Santiago, Chile. Parameter estimation and prediction were carried out using appropriate posterior simulation schemes based on Markov chain Monte Carlo methods. Besides the deviance information criterion (DIC) and the conditional predictive ordinate (CPO), we suggest the use of proper scoring rules based on the posterior predictive distribution for comparing models. For our data set, all these criteria chose the skew‐t model as the best model for the errors. The DIC and CPO criteria are also validated, for the model proposed here, through a simulation study. This study concludes that the DIC is not trustworthy for this kind of complex model.

3.
Although it is widely recognized that climate change will require a major spatial reorganization of forests, our ability to predict exactly how and where forest characteristics and distributions will change has been rather limited. Current efforts to predict the future distribution of forested ecosystems as a function of climate include species distribution models (for fine‐scale predictions) and potential vegetation climate envelope models (for coarse‐grained, large‐scale predictions). Here, we develop and apply an intermediate approach wherein we use stand‐level tolerances of environmental stressors to understand forest distributions and vulnerabilities to anticipated climate change. In contrast to other existing models, this approach can be applied at a continental scale while maintaining a direct link to ecologically relevant, climate‐related stressors. We first demonstrate that shade, drought, and waterlogging tolerances of forest stands are strongly correlated with climate and edaphic conditions in the conterminous United States. This discovery allows the development of a tolerance distribution model (TDM), a novel quantitative tool to assess landscape‐level impacts of climate change. We then focus on evaluating the implications of the drought TDM. Using an ensemble of 17 climate change models to drive this TDM, we estimate that 18% of US ecosystems are vulnerable to drought‐related stress over the coming century. Vulnerable areas are mostly in the Midwest and Northeast United States, as well as high‐elevation areas of the Rocky Mountains. We also infer that stress incurred by a shifting climate should create an opening for the establishment of forest types not currently seen in the conterminous United States.

4.
Amblyomma americanum (L.) is a three‐host tick that spends most of its life off‐host and is an important vector of pathogens in the eastern United States. Our objectives were to develop a predictive statistical model that describes the number of active, off‐host larvae from 2007 to 2011 and to determine the environmental variables associated with this pattern. Data used in this study came from monitoring conducted in northeast Missouri in which off‐host ticks were collected from a permanent plot in a forest and an old field habitat every other week from approximately February to December. Variables examined were day length, degree days, total precipitation prior to sampling, wind speed, saturation deficit, number of adults prior to sampling, and collection site. Of the four regression models tested, the negative binomial model was selected. Fitted candidate models were compared with one another using the values of eight model selection criteria, and model averaging was used to develop a predictive model. The residual plots indicated that the weighted average model performs well in describing the number of larvae. Of the variables considered, the number of larvae was most strongly associated with increasing degree days, the number of active adults prior to sampling, and the forested site.

5.
An increasing number of studies now focus on the influence of climate change on species’ distributions. However, access to predictive climatic datasets for future scenarios is difficult because of their specific formats and/or the need to downscale them geographically. The TYN dataset is freely available to users and provides a synthetic format with several climatic models and IPCC future climate scenarios. Moreover, the CRU historical dataset (1901–2000) is also available, which allows users to create baseline models for current climatic variables. E‐clic is a free, user‐friendly software package that offers three different ways to convert these two datasets into a spatially explicit raster format that is compatible with the most common geographic information systems and usable on different platforms.

6.
Analog forecasting is a mechanism‐free nonlinear method that forecasts a system forward in time by examining how past states deemed similar to the current state moved forward. Previous applications of analog forecasting have been successful at producing robust forecasts for a variety of ecological and physical processes, but the method has typically been presented as an empirical or heuristic procedure rather than as a formal statistical model. The methodology presented here extends the model‐based analog method of McDermott and Wikle (Environmetrics, 27, 2016, 70) by placing analog forecasting within a fully hierarchical statistical framework that can accommodate count observations. Using a Bayesian approach, the hierarchical analog model is able to quantify rigorously the uncertainty associated with forecasts. The hierarchical analog model is applied to a breeding population survey dataset to forecast waterfowl settling patterns in the northwestern United States and Canada. Sea surface temperature (SST) in the Pacific Ocean is used to help identify potential analogs for the waterfowl settling patterns.
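
The basic analog idea described in this abstract can be sketched in a few lines. This is only the heuristic nearest‐neighbor version, not the hierarchical Bayesian model of the paper; the function name `analog_forecast`, the Euclidean distance, the window length `embed`, and the number of analogs `k` are all illustrative assumptions.

```python
import math


def analog_forecast(history, k=2, embed=2):
    """Forecast the next value of a series from its closest past analogs.

    A "state" is taken to be the last `embed` consecutive values
    (an assumption made for this sketch).
    """
    current = history[-embed:]
    candidates = []
    # Every past window with a known successor is a potential analog;
    # the current window itself is excluded because it has no successor.
    for start in range(len(history) - embed):
        window = history[start:start + embed]
        successor = history[start + embed]
        candidates.append((math.dist(window, current), successor))
    # Keep the k most similar past states and average what followed them.
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    return sum(successor for _, successor in nearest) / len(nearest)
```

For a perfectly periodic series such as `[1, 2, 3, 1, 2, 3, 1, 2]`, the two closest analogs of the current state `[1, 2]` were both followed by `3`, so that is the forecast. A point forecast like this carries no uncertainty statement, which is the gap the hierarchical formulation is meant to fill.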

7.
This article introduces new methods for performing classification of complex, high‐dimensional functional data using the functional mixed model (FMM) framework. The FMM relates a functional response to a set of predictors through functional fixed and random effects, which allows it to account for various factors and between‐function correlations. The methods include training and prediction steps. In the training steps we train the FMM by treating class designation as one of the fixed effects, and in the prediction steps we classify the new objects using posterior predictive probabilities of class. Through a Bayesian scheme, we are able to adjust for factors affecting both the functions and the class designations. While the methods can be used in any FMM framework, we provide details for two specific Bayesian approaches: the Gaussian, wavelet‐based FMM (G‐WFMM) and the robust, wavelet‐based FMM (R‐WFMM). Both methods perform modeling in the wavelet space, which yields parsimonious representations for the functions and can naturally adapt to local features and complex nonstationarities in the functions. The R‐WFMM allows potentially heavier tails for features of the functions indexed by particular wavelet coefficients, leading to a down‐weighting of outliers that makes the method robust to outlying functions or regions of functions. The models are applied to a pancreatic cancer mass spectroscopy data set and compared with other recently developed functional classification methods.

8.
The projection of age‐stratified cancer incidence and mortality rates is of great interest because of demographic changes as well as therapeutic and diagnostic developments. Bayesian age–period–cohort (APC) models are well suited for the analysis of such data, but are not yet used in the routine practice of epidemiologists. One reason may be that Bayesian APC models have been criticized for producing prediction intervals that are too wide. Furthermore, the fitting of Bayesian APC models is usually done using Markov chain Monte Carlo (MCMC), which introduces complex convergence concerns and may be subject to additional technical problems. In this paper we address both concerns, developing efficient MCMC‐free software for routine use in epidemiological applications. We apply Bayesian APC models to annual lung cancer data for females in five different countries, previously analyzed in the literature. To assess the predictive quality, we omit the observations from the last 10 years and compare the projections with the actual observed data based on the absolute error and the continuous ranked probability score. Further, we assess calibration of the one‐step‐ahead predictive distributions. In our application, the probabilistic forecasts obtained by the Bayesian APC model are well calibrated and not too wide. A comparison to projections obtained by a generalized Lee–Carter model is also given. The methodology is implemented in the user‐friendly R‐package BAPC using integrated nested Laplace approximations.
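
The continuous ranked probability score mentioned above can be estimated from predictive samples using the standard identity CRPS(F, y) = E|X − y| − ½ E|X − X′|, where X and X′ are independent draws from the forecast distribution F. The sketch below is a plain sample‐based estimator; the function name `crps_from_samples` is hypothetical, and this is not the code used in the BAPC package.

```python
def crps_from_samples(samples, observed):
    """Sample-based CRPS estimate; lower is better."""
    n = len(samples)
    # Average distance between predictive draws and the observation.
    accuracy = sum(abs(x - observed) for x in samples) / n
    # Average distance between pairs of draws (spread of the forecast).
    spread = sum(abs(a - b) for a in samples for b in samples) / (n * n)
    return accuracy - 0.5 * spread
```

With a single draw the spread term vanishes and the score reduces to the absolute error, which is why the two measures named in the abstract can be compared on the same scale.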

9.
Occupancy models (Ecology, 2002; 83: 2248) were developed to infer the probability that a species under investigation occupies a site. Bayesian analysis of these models can be undertaken using statistical packages such as WinBUGS, OpenBUGS, JAGS, and more recently Stan; however, because these packages were not developed specifically to fit occupancy models, one often experiences long run times when undertaking an analysis. Bayesian spatial single‐season occupancy models can also be fit using the R package stocc. That approach assumes that the detection and occupancy regression effects are modeled using probit link functions. The logistic link function, however, is algebraically more tractable and allows one to easily interpret the coefficient effects of an estimated model by using odds ratios, which is not easily done for a probit link function for models that do not include spatial random effects. We develop a Gibbs sampler to obtain posterior samples from the posterior distribution of the parameters of various occupancy models (nonspatial and spatial) when logit link functions are used to model the regression effects of the detection and occupancy processes. We apply our methods to data extracted from the 2nd Southern African Bird Atlas Project to produce species distribution maps of the Cape weaver (Ploceus capensis) and the helmeted guineafowl (Numida meleagris) for South Africa. We found that the Gibbs sampling algorithm developed produces posterior samples that are identical to those obtained when using JAGS and Stan and that in certain cases the posterior chains mix much faster than those obtained when using JAGS, stocc, and Stan. Our algorithms are implemented in the R package Rcppocc. The software is freely available and stored on GitHub ( https://github.com/AllanClark/Rcppocc ).
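
For orientation, the single‐season occupancy likelihood with logit links can be written out for one site. This is a minimal sketch of the standard model (constant detection probability across visits, no covariates or spatial effects), not the Gibbs sampler of the paper or the Rcppocc API; the names `inv_logit` and `site_likelihood` are illustrative.

```python
import math


def inv_logit(x):
    """Map a value on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def site_likelihood(detections, occ_logit, det_logit):
    """Likelihood of one site's detection history (list of 0/1 per visit)."""
    psi = inv_logit(occ_logit)  # probability the site is occupied
    p = inv_logit(det_logit)    # per-visit detection probability if occupied
    detected = sum(detections)
    visits = len(detections)
    occupied_term = psi * p ** detected * (1.0 - p) ** (visits - detected)
    if detected == 0:
        # An all-zero history can also arise from an unoccupied site.
        return occupied_term + (1.0 - psi)
    return occupied_term
```

Under the logit link, a one‐unit increase in a covariate multiplies the odds of occupancy by the exponential of its coefficient, which is the odds‐ratio interpretation the abstract refers to.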

10.
When analyzing the geographical variations of disease risk, one common problem is data sparseness. In such a setting, we investigate the possibility of using Bayesian shared spatial component models to strengthen inference and correct for any spatially structured sources of bias, when distinct data sources on one or more related diseases are available. Specifically, we apply our models to analyze the spatial variation of risk of two forms of scrapie infection affecting sheep in Wales (UK) using three surveillance sources on each disease. We first model each disease separately from the combined data sources and then extend our approach to jointly analyze diseases and data sources. We assess the predictive performance of several nested joint models through pseudo cross‐validatory predictive model checks.

11.
Myriad human activities increasingly threaten the existence of many species. A variety of conservation interventions such as habitat restoration, protected areas, and captive breeding have been used to prevent extinctions. Evaluating the effectiveness of these interventions requires appropriate statistical methods, given the quantity and quality of available data. Historically, analysis of variance has been used with some form of predetermined before‐after control‐impact design to estimate the effects of large‐scale experiments or conservation interventions. However, ad hoc retrospective study designs or the presence of random effects at multiple scales may preclude the use of these tools. We evaluated the effects of a large‐scale supplementation program on the density of adult Chinook salmon (Oncorhynchus tshawytscha) from the Snake River basin in the northwestern United States, currently listed under the U.S. Endangered Species Act. We analyzed 43 years of data from 22 populations, accounting for random effects across time and space using a form of Bayesian hierarchical time‐series model common in analyses of financial markets. We found that varying degrees of supplementation over a period of 25 years increased the density of natural‐origin adults, on average, by 0–8% relative to nonsupplementation years. Thirty‐nine of the 43 year effects were at least two times larger in magnitude than the mean supplementation effect, suggesting that common environmental variables play a more important role in driving interannual variability in adult density. Additional residual variation in density varied considerably across the region, but there was no systematic difference between supplemented and reference populations. Our results demonstrate the power of hierarchical Bayesian models to detect the diffuse effects of management interventions and to quantitatively describe the variability of intervention success. Nevertheless, our study could not address whether ecological factors (e.g., competition) were more important than genetic considerations (e.g., inbreeding depression) in determining the response to supplementation.

12.
Detection-nondetection data are often used to investigate species range dynamics using Bayesian occupancy models, which rely on Markov chain Monte Carlo (MCMC) methods to sample from the posterior distribution of the parameters of the model. In this article we develop two Variational Bayes (VB) approximations to the posterior distribution of the parameters of a single-season site occupancy model that uses logistic link functions to model the probability of species occurrence at sites and the species detection probabilities. This task is accomplished through the development of iterative algorithms that do not use MCMC methods. Simulations and small practical examples demonstrate the effectiveness of the proposed technique. We specifically show that (under certain circumstances) the variational distributions can provide accurate approximations to the true posterior distributions of the parameters of the model when the number of visits per site (K) is as low as three and that the accuracy of the approximations improves as K increases. We also show that the methodology can be used to obtain the posterior distribution of the predictive distribution of the proportion of sites occupied (PAO).

13.
Correlated binary response data with covariates are ubiquitous in longitudinal or spatial studies. Among the existing statistical models, the best known for this type of data is the multivariate probit model, which uses a Gaussian link to model dependence at the latent level. However, a symmetric link may not be appropriate if the data are highly imbalanced. Here, we propose a multivariate skew-elliptical link model for correlated binary responses, which includes the multivariate probit model as a special case. Furthermore, we perform Bayesian inference for this new model and prove that the regression coefficients have a closed-form unified skew-elliptical posterior with an elliptical prior. The new methodology is illustrated by an application to COVID-19 data from three different counties of the state of California, USA. By jointly modeling extreme spikes in weekly new cases, our results show that the spatial dependence cannot be neglected. Furthermore, the results also show that the skewed latent structure of our proposed model improves the flexibility of the multivariate probit model and provides a better fit to our highly imbalanced dataset.

14.
Shared random effects joint models are becoming increasingly popular for investigating the relationship between longitudinal and time‐to‐event data. Although appealing, such complex models are computationally intensive, and quick, approximate methods may provide a reasonable alternative. In this paper, we first compare the shared random effects model with two approximate approaches: a naïve proportional hazards model with a time‐dependent covariate and a two‐stage joint model, which uses plug‐in estimates of the fitted values from a longitudinal analysis as covariates in a survival model. We show that the approximate approaches should be avoided since they can severely underestimate any association between the current underlying longitudinal value and the event hazard. We present classical and Bayesian implementations of the shared random effects model and highlight the advantages of the latter for making predictions. We then apply the models described to a study of abdominal aortic aneurysms (AAA) to investigate the association between AAA diameter and the hazard of AAA rupture. Out‐of‐sample predictions of future AAA growth and hazard of rupture are derived from Bayesian posterior predictive distributions, which are easily calculated within an MCMC framework. Finally, using a multivariate survival sub‐model we show that underlying diameter rather than the rate of growth is the most important predictor of AAA rupture.

15.
Global nutrient cycles have been altered by the use of fossil fuels and fertilizers, resulting in increases in nutrient loads to aquatic systems. In the United States, excess nutrients have been repeatedly reported as the primary cause of lake water quality impairments. Setting nutrient criteria that are protective of a lake's ecological condition is one common solution; however, the data required to do this are not always easily available. A useful solution for this is to combine available field data (i.e., the United States Environmental Protection Agency (USEPA) National Lake Assessment (NLA)) with average annual nutrient load models (i.e., the USGS SPARROW model) to estimate summer concentrations across a large number of lakes. In this paper we use this combined approach and compare the observed total nitrogen (TN) and total phosphorus (TP) concentrations in Northeastern lakes from the 2007 National Lake Assessment to those predicted by the Northeast SPARROW model. We successfully adjusted the SPARROW predictions to the NLA observations with the use of Vollenweider equations, simple input-output models that predict nutrient concentrations in lakes based on nutrient loads and hydraulic residence time. This allows us to better predict summer concentrations of TN and TP in Northeastern lakes and ponds. On average, the Vollenweider models improved our predicted concentrations by 18.7% for nitrogen and 19.0% for phosphorus. These improved predictions are being used in other studies to model ecosystem services (e.g., aesthetics) and dis-services (e.g., cyanobacterial blooms) for ~18,000 lakes in the Northeastern United States.
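
To make the input-output idea concrete, here is a minimal sketch of one classic Vollenweider-type form, in which the in-lake concentration equals the inflow concentration divided by one plus the square root of the hydraulic residence time in years. The abstract does not state which equations or coefficients the authors fitted, so the functional form, the function name `vollenweider_concentration`, and the units are assumptions for illustration only.

```python
import math


def vollenweider_concentration(inflow_conc, residence_time_years):
    """Predicted in-lake nutrient concentration (same units as inflow_conc).

    Assumed form: C_lake = C_in / (1 + sqrt(tau)), with tau in years.
    """
    if residence_time_years < 0:
        raise ValueError("residence time must be non-negative")
    # Longer residence times give more opportunity for nutrient loss
    # (for example, settling), so the predicted concentration falls.
    return inflow_conc / (1.0 + math.sqrt(residence_time_years))
```

With a residence time of one year this form halves the inflow concentration; as the residence time approaches zero, the lake concentration approaches that of the inflow.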

16.
Several statistical methods have been proposed for estimating infection prevalence based on pooled samples, but these methods generally presume the application of perfect diagnostic tests, which in practice do not exist. To optimize prevalence estimation based on pooled samples, currently available and new statistical models were described and compared. Three groups were tested: (a) frequentist models, (b) Markov chain Monte Carlo (MCMC) Bayesian models, and (c) Exact Bayesian Computation (EBC) models. Simulated data allowed the comparison of the models, including testing their performance under complex situations such as imperfect tests with a sensitivity varying according to the pool weight. In addition, all models were applied to data derived from the literature to demonstrate the influence of the model on real‐prevalence estimates. All models were implemented in the freely available R and OpenBUGS software and are presented in Appendix S1. Bayesian models can flexibly take into account the imperfect sensitivity and specificity of the diagnostic test (as well as the influence of pool‐related or external variables) and are therefore the method of choice for calculating population prevalence based on pooled samples. However, such complex models require very precise information on test characteristics, which may in general not be available.
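
As a point of reference for the simplest frequentist case, the sketch below inverts the probability that a pool tests positive, π = Se·(1 − (1 − p)^k) + (1 − Sp)·(1 − p)^k, to recover the individual‐level prevalence p. It assumes equal pool sizes k and a sensitivity and specificity that are known and do not depend on the pool, which is exactly the simplification the abstract warns about. The function name `pooled_prevalence` is hypothetical and this is not the code from the paper's Appendix S1.

```python
def pooled_prevalence(positive_pools, total_pools, pool_size,
                      sensitivity=1.0, specificity=1.0):
    """Point estimate of individual prevalence from pooled test results."""
    pi_hat = positive_pools / total_pools  # observed share of positive pools
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("test must be better than chance (Se + Sp > 1)")
    # Estimated probability that a pool contains no infected individual.
    all_negative = (sensitivity - pi_hat) / denom
    # Sampling noise can push the estimate outside [0, 1]; clamp it.
    all_negative = min(1.0, max(0.0, all_negative))
    return 1.0 - all_negative ** (1.0 / pool_size)
```

For example, with a perfect test, 19 positive pools out of 100 pools of size 2 correspond to an estimated prevalence of about 0.1.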

17.
Dropouts are common in longitudinal studies. If the dropout probability depends on the missing observations at or after dropout, the dropout is called informative (or nonignorable) dropout (ID). Failure to accommodate such a dropout mechanism in the model will bias the parameter estimates. We propose a conditional autoregressive model for longitudinal binary data with an ID model such that the probabilities of positive outcomes as well as the dropout indicator at each occasion are logit linear in some covariates and outcomes. This model, which adopts a marginal model for outcomes and a conditional model for dropouts, is called a selection model. To allow for heterogeneity and clustering effects, the outcome model is extended to incorporate mixture and random effects. Lastly, the model is further extended to a novel model that models the outcome and dropout jointly such that their dependency is formulated through an odds ratio function. Parameters are estimated by a Bayesian approach implemented using the user‐friendly Bayesian software WinBUGS. A methadone clinic dataset is analyzed to illustrate the proposed models. The results show that the treatment time effect is still significant but weaker after allowing for an ID process in the data. Finally, the effect of dropout on parameter estimates is evaluated through simulation studies.

18.
Proactive conservation planning for species requires the identification of important spatial attributes across ecologically relevant scales in a model-based framework. However, it is often difficult to develop predictive models, as the explanatory data required for model development across regional management scales are rarely available. Golden eagles are a large-ranging predator of conservation concern in the United States that may be negatively affected by wind energy development. Thus, identifying, prior to development, the landscapes least likely to pose conflict between eagles and wind development via shared space will be critical for conserving populations in the face of imposing development. We used publicly available data on golden eagle nests to generate predictive models of golden eagle nesting sites in Wyoming, USA, using a suite of environmental and anthropogenic variables. By overlaying predictive models of golden eagle nesting habitat with wind energy resource maps, we highlight areas of potential conflict between eagle nesting habitat and wind development. However, our results suggest that wind potential and the relative probability of golden eagle nesting are not necessarily spatially correlated. Indeed, the majority of our sample frame includes areas with disparate predictions between suitable nesting habitat and potential for developing wind energy resources. Map predictions cannot replace on-the-ground monitoring for potential risk of wind turbines on wildlife populations, though they provide industry and managers a useful framework to first assess potential development.

19.
Ghosh S, Gelfand AE, Zhu K, Clark JS. Biometrics, 2012, 68(3): 878-885
Many applications involve count data from a process that yields an excess number of zeros. Zero-inflated count models, in particular zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models, along with Poisson hurdle models, are commonly used to address this problem. However, these models struggle to explain an extreme incidence of zeros (say, more than 80%), and especially to find important covariates. In fact, the ZIP may struggle even when the proportion is not extreme. To redress this problem we propose the class of k-ZIG models. These models allow more flexible modeling of both the zero-inflation and the nonzero counts, allowing interplay between these two components. We develop the properties of this new class of models, including reparameterization to a natural link function. The models are straightforwardly fitted within a Bayesian framework. The methodology is illustrated with simulated data examples as well as a forest seedling dataset obtained from the USDA Forest Service's Forest Inventory and Analysis program.
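
For readers unfamiliar with the baseline that this abstract improves on, the zero-inflated Poisson probability mass function is sketched below: with probability ω the count is a structural zero, and otherwise it is drawn from a Poisson distribution with mean λ. This illustrates the standard ZIP only; the k-ZIG class is not defined in the abstract and is not reproduced here. The function name `zip_pmf` is illustrative.

```python
import math


def zip_pmf(y, lam, omega):
    """P(Y = y) under a zero-inflated Poisson with mean lam and inflation omega."""
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    if y == 0:
        # Zeros come from two sources: structural zeros and Poisson zeros.
        return omega + (1.0 - omega) * poisson
    return (1.0 - omega) * poisson
```

Because ω and λ both affect the probability of a zero, data with a very high share of zeros leave little information for separating the two components, which gives some intuition for the difficulty described in the abstract.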

20.
Bayesian inference is becoming a common statistical approach to phylogenetic estimation because, among other reasons, it allows for rapid analysis of large data sets with complex evolutionary models. Conveniently, Bayesian phylogenetic methods use currently available stochastic models of sequence evolution. However, as with other model-based approaches, the results of Bayesian inference are conditional on the assumed model of evolution: inadequate models (models that poorly fit the data) may result in erroneous inferences. In this article, I present a Bayesian phylogenetic method that evaluates the adequacy of evolutionary models using posterior predictive distributions. By evaluating a model's posterior predictive performance, an adequate model can be selected for a Bayesian phylogenetic study. Although I present a single test statistic that assesses the overall (global) performance of a phylogenetic model, a variety of test statistics can be tailored to evaluate specific features (local performance) of evolutionary models to identify sources of failure. The method presented here, unlike the likelihood-ratio test and parametric bootstrap, accounts for uncertainty in the phylogeny and model parameters.
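
The general mechanics of a posterior predictive check can be sketched generically: for each posterior draw, simulate a replicate data set, compute a test statistic, and report how often the replicate statistic is at least as large as the observed one. This is a generic outline, not the phylogenetic test statistic or software from the article; the function name `posterior_predictive_pvalue` and the `simulate` and `statistic` callables are placeholders the user would supply.

```python
import random


def posterior_predictive_pvalue(observed, posterior_draws, simulate, statistic, rng):
    """Share of replicate data sets whose statistic is >= the observed one."""
    t_observed = statistic(observed)
    exceed = 0
    for theta in posterior_draws:
        # Simulating once per posterior draw propagates parameter
        # uncertainty into the reference distribution of the statistic.
        replicate = simulate(theta, len(observed), rng)
        if statistic(replicate) >= t_observed:
            exceed += 1
    return exceed / len(posterior_draws)


def simulate_bernoulli(theta, n, rng):
    """Toy simulator: n independent 0/1 outcomes with success probability theta."""
    return [1 if rng.random() < theta else 0 for _ in range(n)]
```

A call such as `posterior_predictive_pvalue(data, draws, simulate_bernoulli, sum, random.Random(1))` would check whether the observed number of successes is typical under the fitted model; values near 0 or 1 suggest the model fails to reproduce that feature of the data.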

