Similar literature
 20 similar articles found (search time: 15 ms)
1.
Nathan P. Lemoine. Oikos 2019, 128(7): 912-928
Throughout the last two decades, Bayesian statistical methods have proliferated throughout ecology and evolution. Numerous previous references established both philosophical and computational guidelines for implementing Bayesian methods. However, protocols for incorporating prior information, the defining characteristic of Bayesian philosophy, are nearly nonexistent in the ecological literature. Here, I hope to encourage the use of weakly informative priors in ecology and evolution by providing a ‘consumer's guide’ to weakly informative priors. The first section outlines three reasons why ecologists should abandon noninformative priors: 1) common flat priors are not always noninformative, 2) noninformative priors provide the same result as simpler frequentist methods, and 3) noninformative priors suffer from the same high type I and type M error rates as frequentist methods. The second section provides a guide for implementing informative priors, wherein I detail convenient ‘reference’ prior distributions for common statistical models (i.e. regression, ANOVA, hierarchical models). I then use simulations to visually demonstrate how informative priors influence posterior parameter estimates. With the guidelines provided here, I hope to encourage the use of weakly informative priors for Bayesian analyses in ecology. Ecologists can and should debate the appropriate form of prior information, but should consider weakly informative priors as the new ‘default’ prior for any Bayesian model.
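The claim that common flat priors are not always noninformative can be illustrated with a small calculation. The sketch below is not taken from the paper; it assumes a logistic-regression intercept with a Normal(0, sd) prior on the logit scale and asks how much prior mass lands on probabilities between 0.01 and 0.99. The function names and the two prior standard deviations (100 and 1.5) are illustrative choices only.

```python
import math

def normal_cdf(x, mean=0.0, sd=1.0):
    """Cumulative distribution function of a Normal(mean, sd) variable."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def logit(p):
    """Map a probability in (0, 1) to the log-odds scale."""
    return math.log(p / (1.0 - p))

def prior_mass_on_probability(lo, hi, prior_sd):
    """Prior probability that lo < p < hi when logit(p) ~ Normal(0, prior_sd)."""
    return normal_cdf(logit(hi), 0.0, prior_sd) - normal_cdf(logit(lo), 0.0, prior_sd)

# A "flat-looking" prior on the logit scale (hypothetical sd = 100)
vague = prior_mass_on_probability(0.01, 0.99, prior_sd=100.0)
# A weakly informative prior on the logit scale (hypothetical sd = 1.5)
weak = prior_mass_on_probability(0.01, 0.99, prior_sd=1.5)

print(f"mass on 0.01 < p < 0.99, sd=100: {vague:.4f}")
print(f"mass on 0.01 < p < 0.99, sd=1.5: {weak:.4f}")
```

Under these assumptions the very wide prior puts only a few percent of its mass on probabilities between 0.01 and 0.99, so it strongly favours probabilities near 0 or 1, whereas the narrower prior spreads its mass over plausible values.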

2.
The ecological study design suffers from a broad range of biases that result from the loss of information regarding the joint distribution of individual-level outcomes, exposures, and confounders. The consequent nonidentifiability of individual-level models cannot be overcome without additional information; we combine ecological data with a sample of individual-level case-control data. The focus of this article is hierarchical models to account for between-group heterogeneity. Estimation and inference pose serious computational challenges. We present a Bayesian implementation based on a data augmentation scheme where the unobserved data are treated as auxiliary variables. The methods are illustrated with a dataset of county-specific infant mortality data from the state of North Carolina.

3.
A common concern in Bayesian data analysis is that an inappropriately informative prior may unduly influence posterior inferences. In the context of Bayesian clinical trial design, well chosen priors are important to ensure that posterior-based decision rules have good frequentist properties. However, it is difficult to quantify prior information in all but the most stylized models. This issue may be addressed by quantifying the prior information in terms of a number of hypothetical patients, i.e., a prior effective sample size (ESS). Prior ESS provides a useful tool for understanding the impact of prior assumptions. For example, the prior ESS may be used to guide calibration of prior variances and other hyperprior parameters. In this paper, we discuss such prior sensitivity analyses by using a recently proposed method to compute a prior ESS. We apply this in several typical settings of Bayesian biomedical data analysis and clinical trial design. The data analyses include cross-tabulated counts, multiple correlated diagnostic tests, and ordinal outcomes using a proportional-odds model. The study designs include a phase I trial with late-onset toxicities, a phase II trial that monitors event times, and a phase I/II trial with dose-finding based on efficacy and toxicity.
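The idea of expressing a prior as a number of hypothetical patients is easiest to see in the conjugate beta-binomial case, where a Beta(a, b) prior is conventionally read as a + b pseudo-observations. The sketch below covers only that textbook case; it is not the general ESS method the abstract refers to, and the function names and example numbers are hypothetical.

```python
def beta_prior_ess(a, b):
    """Effective sample size of a Beta(a, b) prior: a + b pseudo-observations."""
    return a + b

def prior_weight(a, b, n):
    """Share of the posterior mean contributed by the prior after n observations."""
    return (a + b) / (a + b + n)

def beta_posterior_mean(a, b, successes, n):
    """Posterior mean of a response probability under a Beta(a, b) prior."""
    return (a + successes) / (a + b + n)

# Hypothetical example: Beta(3, 7) prior, 12 responders among 30 patients
a, b, successes, n = 3, 7, 12, 30
ess = beta_prior_ess(a, b)
w = prior_weight(a, b, n)
post_mean = beta_posterior_mean(a, b, successes, n)

print(f"prior ESS = {ess}, prior weight = {w:.2f}, posterior mean = {post_mean:.3f}")
```

In this setting the posterior mean is a weighted average of the prior mean a / (a + b) and the observed proportion, with the prior's weight determined by its ESS relative to the trial size.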

4.
In situ air sparging (IAS) pilot test procedures have been developed that provide rapid, on-site information about IAS performance. The standard pilot test consists of six activities conducted to look for indicators of infeasibility and to characterize the air distribution to the extent necessary to make design decisions about IAS well placement. In addition, safety hazards that need to be addressed prior to full-scale design are identified. Two additional pilot test activities are described for those cases where air distribution must be more precisely defined. The test activities include both chemical tests (tracking contaminant concentrations, dissolved oxygen and tracers) and physical tests (air flow rate and injection pressure, groundwater pressure response). Pilot test data from Eielson Air Force Base, Alaska illustrate implementation of the pilot test and interpretation of the data.

5.
The Bayesian method of phylogenetic inference often produces high posterior probabilities (PPs) for trees or clades, even when the trees are clearly incorrect. The problem appears to be mainly due to large sizes of molecular datasets and to the large-sample properties of Bayesian model selection and its sensitivity to the prior when several of the models under comparison are nearly equally correct (or nearly equally wrong) and are of the same dimension. A previous suggestion to alleviate the problem is to let the internal branch lengths in the tree become increasingly small in the prior with the increase in the data size so that the bifurcating trees are increasingly star-like. In particular, if the internal branch lengths are assigned the exponential prior, the prior mean $\mu_0$ should approach zero faster than $1/\sqrt{n}$ but more slowly than $1/n$, where $n$ is the sequence length. This paper examines the usefulness of this data size-dependent prior using a dataset of the mitochondrial protein-coding genes from the baleen whales, with the prior mean fixed at $\mu_0 = 0.1\,n^{-2/3}$. In this dataset, phylogeny reconstruction is sensitive to the assumed evolutionary model, species sampling and the type of data (DNA or protein sequences), but Bayesian inference using the default prior attaches high PPs for conflicting phylogenetic relationships. The data size-dependent prior alleviates the problem to some extent, giving weaker support for unstable relationships. This prior may be useful in reducing apparent conflicts in the results of Bayesian analysis or in making the method less sensitive to model violations.
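The rate condition on the prior mean can be checked numerically. The sketch below evaluates the prior mean quoted in the abstract, 0.1 times n to the power -2/3, for a few sequence lengths and shows that it shrinks faster than 1/sqrt(n) but more slowly than 1/n. The function name and the chosen sequence lengths are illustrative assumptions.

```python
import math

def prior_mean(n, scale=0.1):
    """Data-size-dependent prior mean for internal branch lengths: scale * n^(-2/3)."""
    return scale * n ** (-2.0 / 3.0)

# Hypothetical sequence lengths
lengths = [1_000, 10_000, 100_000]

# mu0 * sqrt(n) should fall toward zero: mu0 shrinks faster than 1/sqrt(n)
ratio_sqrt = [prior_mean(n) * math.sqrt(n) for n in lengths]
# mu0 * n should grow without bound: mu0 shrinks more slowly than 1/n
ratio_n = [prior_mean(n) * n for n in lengths]

for n, rs, rn in zip(lengths, ratio_sqrt, ratio_n):
    print(f"n={n:>7}  mu0={prior_mean(n):.6f}  mu0*sqrt(n)={rs:.4f}  mu0*n={rn:.4f}")
```

Any exponent strictly between -1 and -1/2 would satisfy the same condition; -2/3 is simply the value used in the abstract.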

6.
When a dataset is imbalanced, the prediction of the scarcely-sampled subpopulation can be over-influenced by the population contributing to the majority of the data. The aim of this study was to develop a Bayesian modelling approach with a balancing informative prior so that the influence of imbalance on the overall prediction could be minimised. The new approach was developed in order to weigh the data in favour of the smaller subset(s). The method was assessed in terms of bias and precision in predicting model parameter estimates of simulated datasets. Moreover, the method was evaluated in predicting optimal dose levels of tobramycin for various age groups in a motivating example. The bias estimates using the balancing informative prior approach were smaller than those generated using the conventional approach, which did not account for the imbalance in the datasets. The precision estimates were also superior. The method was further evaluated in a motivating example of optimal dosage prediction of tobramycin. The resulting predictions also agreed well with what had been reported in the literature. The proposed Bayesian balancing informative prior approach has shown a real potential to adequately weigh the data in favour of smaller subset(s) of data to generate robust prediction models.

7.
ABSTRACT: BACKGROUND: Inference about regulatory networks from high-throughput genomics data is of great interest in systems biology. We present a Bayesian approach to infer gene regulatory networks from time series expression data by integrating various types of biological knowledge. RESULTS: We formulate network construction as a series of variable selection problems and use linear regression to model the data. Our method summarizes additional data sources with an informative prior probability distribution over candidate regression models. We extend the Bayesian model averaging (BMA) variable selection method to select regulators in the regression framework. We summarize the external biological knowledge by an informative prior probability distribution over the candidate regression models. CONCLUSIONS: We demonstrate our method on simulated data and a set of time-series microarray experiments measuring the effect of a drug perturbation on gene expression levels, and show that it outperforms leading regression-based methods in the literature.

8.
Bayesian extrapolation of space-time trends in cancer registry data   (cited by: 1; self-citations: 0; citations by others: 1)
Schmid V, Held L. Biometrics 2004, 60(4): 1034-1042
We apply a full Bayesian model framework to a dataset on stomach cancer mortality in West Germany. The data are stratified by age group, year, and district. Using an age-period-cohort model with an additional spatial component, our goal is to investigate whether there is evidence for space-time interactions in these data. Furthermore, we will determine whether a period-space or a cohort-space interaction model is more appropriate to predict future mortality rates. The setup will be fully Bayesian based on a series of Gaussian Markov random field priors for each of the components. Statistical inference is based on efficient algorithms to block update Gaussian Markov random fields, which have recently been proposed in the literature.

9.
Chen MH, Ibrahim JG. Biometrics 2000, 56(3): 678-685
Correlated count data arise often in practice, especially in repeated measures situations or instances in which observations are collected over time. In this paper, we consider a parametric model for a time series of counts by constructing a likelihood-based version of a model similar to that of Zeger (1988, Biometrika 75, 621-629). The model has the advantage of incorporating both overdispersion and autocorrelation. We consider a Bayesian approach and propose a class of informative prior distributions for the model parameters that are useful for prediction. The prior specification is motivated from the notion of the existence of data from similar previous studies, called historical data, which is then quantified into a prior distribution for the current study. We derive the Bayesian predictive distribution and use a Bayesian criterion, called the predictive L measure, for assessing the predictions for a given time series model. The distribution of the predictive L measure is also derived, which will enable us to compare the predictive ability for each model under consideration. Our methodology is motivated by a real data set involving yearly pollen counts, which is examined in some detail.

10.
Despite benefits for precision, ecologists rarely use informative priors. One reason that ecologists may prefer vague priors is the perception that informative priors reduce accuracy. To date, no ecological study has empirically evaluated data‐derived informative priors' effects on precision and accuracy. To determine the impacts of priors, we evaluated mortality models for tree species using data from a forest dynamics plot in Thailand. Half the models used vague priors, and the remaining half had informative priors. We found precision was greater when using informative priors, but effects on accuracy were more variable. In some cases, prior information improved accuracy, while in others, it was reduced. On average, models with informative priors were no more or less accurate than models without. Our analyses provide a detailed case study on the simultaneous effect of prior information on precision and accuracy and demonstrate that when priors are specified appropriately, they lead to greater precision without systematically reducing model accuracy.

11.
1. Informative Bayesian priors can improve the precision of estimates in ecological studies or estimate parameters for which little or no information is available. While Bayesian analyses are becoming more popular in ecology, the use of strongly informative priors remains rare, perhaps because examples of informative priors are not readily available in the published literature. 2. Dispersal distance is an important ecological parameter, but is difficult to measure and estimates are scarce. General models that provide informative prior estimates of dispersal distances will therefore be valuable. 3. Using a world-wide data set on birds, we develop a predictive model of median natal dispersal distance that includes body mass, wingspan, sex and feeding guild. This model predicts median dispersal distance well when using the fitted data and an independent test data set, explaining up to 53% of the variation. 4. Using this model, we predict a priori estimates of median dispersal distance for 57 woodland-dependent bird species in northern Victoria, Australia. These estimates are then used to investigate the relationship between dispersal ability and vulnerability to landscape-scale changes in habitat cover and fragmentation. 5. We find evidence that woodland bird species with poor predicted dispersal ability are more vulnerable to habitat fragmentation than those species with longer predicted dispersal distances, thus improving the understanding of this important phenomenon. 6. The value of constructing informative priors from existing information is also demonstrated. When used as informative priors for four example species, predicted dispersal distances reduced the 95% credible intervals of posterior estimates of dispersal distance by 8-19%. Further, should we have wished to collect information on avian dispersal distances and relate it to species' responses to habitat loss and fragmentation, data from 221 individuals across 57 species would have been required to obtain estimates with the same precision as those provided by the general model.
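How an informative prior narrows a credible interval can be sketched with the conjugate normal model with known sampling error. This is a generic illustration, not the dispersal model described in the abstract, and every number below (prior means and standard deviations, the data estimate and its standard error) is made up for the example.

```python
import math

def normal_posterior(prior_mean, prior_sd, data_mean, data_se):
    """Posterior mean and sd for a normal mean with a normal prior (precision weighting)."""
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = 1.0 / data_se ** 2
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * prior_mean + data_prec * data_mean) / post_prec
    return post_mean, math.sqrt(1.0 / post_prec)

def interval_width(sd, z=1.96):
    """Width of a central 95% credible interval for a normal posterior."""
    return 2.0 * z * sd

# Hypothetical estimate from field data: mean 5.0 with standard error 1.0
data_mean, data_se = 5.0, 1.0

# Vague prior versus a (hypothetical) informative prior from a general model
_, sd_vague = normal_posterior(0.0, 1000.0, data_mean, data_se)
_, sd_inform = normal_posterior(4.0, 2.0, data_mean, data_se)

reduction = 1.0 - interval_width(sd_inform) / interval_width(sd_vague)
print(f"95% interval width reduced by {100 * reduction:.1f}%")
```

The size of the reduction depends entirely on the ratio of prior precision to data precision, so the figure printed here says nothing about the 8-19% reported in the abstract.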

12.
French JL, Ibrahim JG. Biometrics 2002, 58(4): 906-916
The objective of a chronic rodent bioassay is to assess the impact of a chemical compound on the development of tumors. However, most tumor types are not observable prior to necropsy, making direct estimation of the tumor incidence rate problematic. In such cases, estimation can proceed only if the study incorporates multiple interim sacrifices or we make use of simplified parametric or nonparametric models. In addition, it is widely accepted that other factors, such as weight, can be related to both dose level and tumor onset, confounding the association of interest. However, there is not typically enough information in the current study to assess such effects. The addition of historical data can help alleviate this problem. In this article, we propose a novel Bayesian semiparametric model for the analysis of data from rodent carcinogenicity studies. We develop informative prior distributions for covariate effects through the use of historical control data and outline a Gibbs sampling scheme. We implement the model by analyzing data from a National Toxicology Program chronic rodent bioassay.

13.
This Genetic Analysis Workshop 13 contribution presents a linkage analysis of hypertension in the Framingham data based on the posterior probability of linkage, or PPL. We dichotomized the phenotype, coding individuals who had been treated for hypertension at any time, as well as those with repeated high blood pressure measurements, as affected. Here we use a new variation on the multipoint PPL that incorporates integration over the genetic model. PPLs were computed for chromosomes 1 through 5, 11, 14, and 17 and remained below the 2% assumed prior probability of linkage for 73% of the locations examined. The maximum PPL of 4.5% was obtained on chromosome 1 at 178 cM. Although this is more than twice the assumed prior probability of linkage, it is well below a level at which we would recommend committing substantial additional resources to molecular follow-up. While the PPL analysis of this data remains inconclusive, Bayesian methodology gives us a clear mechanism for using the information gained here in further studies.
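The gap between a 2% prior and a 4.5% posterior can be put on the odds scale with a back-of-the-envelope calculation. The sketch below treats the PPL as the posterior of a simple two-hypothesis comparison (linkage versus no linkage), which is a simplifying assumption; the multipoint PPL in the abstract integrates over the genetic model, so this is only a rough reading of the reported figures. The function names are hypothetical.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)

def implied_bayes_factor(prior, posterior):
    """Ratio of posterior odds to prior odds for a two-hypothesis comparison."""
    return odds(posterior) / odds(prior)

def posterior_from_bayes_factor(prior, bayes_factor):
    """Invert the calculation: posterior probability given prior and Bayes factor."""
    post_odds = odds(prior) * bayes_factor
    return post_odds / (1.0 + post_odds)

prior_linkage = 0.02   # assumed prior probability of linkage (from the abstract)
max_ppl = 0.045        # maximum PPL reported on chromosome 1 (from the abstract)

bf = implied_bayes_factor(prior_linkage, max_ppl)
print(f"implied Bayes factor: {bf:.2f}")
```

Read this way, the data shift the odds of linkage by a factor of a little over two, which is consistent with the abstract's description of the result as inconclusive.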

14.
The restricted mean survival time (RMST) evaluates the expectation of survival time truncated by a prespecified time point, because the mean survival time in the presence of censoring is typically not estimable. The frequentist inference procedure for RMST has been widely advocated for comparison of two survival curves, while research from the Bayesian perspective is rather limited. For the RMST of both right- and interval-censored data, we propose Bayesian nonparametric estimation and inference procedures. By assigning a mixture of Dirichlet processes (MDP) prior to the distribution function, we can estimate the posterior distribution of RMST. We also explore another Bayesian nonparametric approach using the Dirichlet process mixture model and make comparisons with the frequentist nonparametric method. Simulation studies demonstrate that the Bayesian nonparametric RMST under diffuse MDP priors leads to robust estimation and under informative priors it can incorporate prior knowledge into the nonparametric estimator. Analysis of real trial examples demonstrates the flexibility and interpretability of the Bayesian nonparametric RMST for both right- and interval-censored data.
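For orientation, the RMST up to a truncation time tau is the area under the survival curve from 0 to tau. The sketch below computes it from a Kaplan-Meier curve for right-censored data, which corresponds to the frequentist nonparametric comparator mentioned in the abstract rather than the Bayesian MDP procedure it proposes. The function names and the toy data are assumptions made for the example.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier curve as [(event_time, survival)] for right-censored data.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve

def rmst(times, events, tau):
    """Area under the Kaplan-Meier step function between 0 and tau."""
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    # Final rectangle from the last event time before tau up to tau
    area += prev_s * (tau - prev_t)
    return area

# Toy data: four subjects, the second one censored at time 2
print(rmst([1, 2, 3, 4], [1, 0, 1, 1], tau=3))
```

With no censoring and tau at the largest observed time, this reduces to the ordinary sample mean, which is a convenient sanity check.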

15.
Individualized anatomical information has been used as prior knowledge in Bayesian inference paradigms of whole-brain network models. However, the actual sensitivity to such personalized information in priors is still unknown. In this study, we introduce the use of fully Bayesian information criteria and leave-one-out cross-validation technique on the subject-specific information to assess different epileptogenicity hypotheses regarding the location of pathological brain areas based on a priori knowledge from dynamical system properties. The Bayesian Virtual Epileptic Patient (BVEP) model, which relies on the fusion of structural data of individuals, a generative model of epileptiform discharges, and a self-tuning Monte Carlo sampling algorithm, is used to infer the spatial map of epileptogenicity across different brain areas. Our results indicate that measuring the out-of-sample prediction accuracy of the BVEP model with informative priors enables reliable and efficient evaluation of potential hypotheses regarding the degree of epileptogenicity across different brain regions. In contrast, while using uninformative priors, the information criteria are unable to provide strong evidence about the epileptogenicity of brain areas. We also show that the fully Bayesian criteria correctly assess different hypotheses about both structural and functional components of whole-brain models that differ across individuals. The fully Bayesian information-theory based approach used in this study suggests a patient-specific strategy for epileptogenicity hypothesis testing in generative brain network models of epilepsy to improve surgical outcomes.

16.
Wildfires are impactful natural disasters, creating a significant impact across many rural communities. Predicting wildfire probability provides authorities with invaluable information to take preventive measures at the early stages. This study establishes Bayesian modelling for predicting the wildfire event probability based on a set of environmental predictors and forest vulnerability, represented by the normalized difference vegetation index. Prior information about the impact of these predictors on the likelihood of wildfire is available in the reports on the past major wildfire events. In that sense, the use of prior information in the Bayesian models has the potential to provide accurate predictions for the wildfire probability. Moreover, the relationship between the predictors creates mediating effects on the likelihood of a wildfire event. A multivariate prior distribution in the Bayesian modelling can capture the mediating effects. In this study, Bayesian models with informative and noninformative priors are considered with independent and multivariate prior distributions to utilize the available prior information and handle the mediating effects between the predictors using the normalized difference vegetation index data provided by Google Earth Engine. Nine years of data were gathered across 9841 sampled areas in a forested land of Australia. Modelling results concluded that forest vulnerability is found to be the dominant predictor of wildfire probability. This modelling can help create a Wildfire Warning Index based on climate data and forest vulnerability measurements, enabling preventative actions in high-risk and targeted areas.

17.
We consider the problem of estimating a population size by removal sampling when the sampling rate is unknown. Bayesian methods are now widespread and allow prior knowledge to be included in the analysis. However, we show that Bayes estimates based on default improper priors lead to improper posteriors or infinite estimates. Similarly, weakly informative priors give unstable estimators that are sensitive to the choice of hyperparameters. By examining the likelihood, we show that population size estimates can be stabilized by penalizing small values of the sampling rate or large values of the population size. Based on theoretical results and simulation studies, we propose some recommendations on the choice of the prior. We then apply our results to real datasets.

18.
Bayesian inference of mixed models in quantitative genetics of crop species   (cited by: 1; self-citations: 0; citations by others: 1)
The objectives of this study were to implement a Bayesian framework for mixed models analysis in crop species breeding and to exploit alternatives for informative prior elicitation. Bayesian inference for genetic evaluation in annual crop breeding was illustrated with the first two half-sib selection cycles in a popcorn population. The Bayesian framework was based on the Just Another Gibbs Sampler software and the R2jags package. For the first cycle, a non-informative prior for the inverse of the variance components and an informative prior based on meta-analysis were used. For the second cycle, a non-informative prior and an informative prior defined as the posterior from the non-informative and informative analyses of the first cycle were used. Regarding the first cycle, the use of an informative prior from the meta-analysis provided clearly distinct results relative to the analysis with a non-informative prior only for the grain yield. Regarding the second cycle, the results for the expansion volume and grain yield showed differences among the three analyses. The differences between the non-informative and informative prior analyses were restricted to variance components and heritability. The correlations between the predicted breeding values from these analyses were almost perfect.

19.
Agresti A, Min Y. Biometrics 2005, 61(2): 515-523
This article investigates the performance, in a frequentist sense, of Bayesian confidence intervals (CIs) for the difference of proportions, relative risk, and odds ratio in 2 x 2 contingency tables. We consider beta priors, logit-normal priors, and related correlated priors for the two binomial parameters. The goal was to analyze whether certain settings for prior parameters tend to provide good coverage performance regardless of the true association parameter values. For the relative risk and odds ratio, we recommend tail intervals over highest posterior density (HPD) intervals, for invariance reasons. To protect against potentially very poor coverage probabilities when the effect is large, it is best to use a diffuse prior, and we recommend the Jeffreys prior. Otherwise, with relatively small samples, Bayesian CIs using more informative (even uniform) priors tend to have poorer performance than the frequentist CIs based on inverting score tests, which perform uniformly quite well for these parameters.

20.
Summary: With increasing frequency, epidemiologic studies are addressing hypotheses regarding gene‐environment interaction. In many well‐studied candidate genes and for standard dietary and behavioral epidemiologic exposures, there is often substantial prior information available that may be used to analyze current data as well as for designing a new study. In this article, first, we propose a proper full Bayesian approach for analyzing studies of gene–environment interaction. The Bayesian approach provides a natural way to incorporate uncertainties around the assumption of gene–environment independence, often used in such an analysis. We then consider Bayesian sample size determination criteria for both estimation and hypothesis testing regarding the multiplicative gene–environment interaction parameter. We illustrate our proposed methods using data from a large ongoing case–control study of colorectal cancer investigating the interaction of N‐acetyl transferase type 2 (NAT2) with smoking and red meat consumption. We use the existing data to elicit a design prior and show how to use this information in allocating cases and controls in planning a future study that investigates the same interaction parameters. The Bayesian design and analysis strategies are compared with their corresponding frequentist counterparts.
