Similar Articles
20 similar articles retrieved (search time: 593 ms)
1.
Zhao JX, Foulkes AS, George EI. Biometrics 2005, 61(2):591-599
Characterizing the process by which molecular and cellular level changes occur over time will have broad implications for clinical decision making and help further our knowledge of disease etiology across many complex diseases. However, this presents an analytic challenge due to the large number of potentially relevant biomarkers and the complex, uncharacterized relationships among them. We propose an exploratory Bayesian model selection procedure that searches for model simplicity through independence testing of multiple discrete biomarkers measured over time. Bayes factor calculations are used to identify and compare models that are best supported by the data. For large model spaces, i.e., a large number of multi-leveled biomarkers, we propose a Markov chain Monte Carlo (MCMC) stochastic search algorithm for finding promising models. We apply our procedure to explore the extent to which HIV-1 genetic changes occur independently over time.
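The model comparison referred to here rests on the Bayes factor; in generic notation (ours, not the paper's), for two candidate independence structures M_1 and M_2 fitted to the observed biomarker data y,

    BF_{12} = \frac{p(y \mid M_1)}{p(y \mid M_2)}
            = \frac{\int p(y \mid \theta_1, M_1)\, \pi(\theta_1 \mid M_1)\, d\theta_1}
                   {\int p(y \mid \theta_2, M_2)\, \pi(\theta_2 \mid M_2)\, d\theta_2},

with BF_{12} > 1 favoring M_1. The MCMC stochastic search moves through the space of such structures and ranks the visited models by these marginal-likelihood comparisons.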

2.
Li Y, Wileyto EP, Heitjan DF. Biometrics 2011, 67(4):1321-1329
In smoking cessation clinical trials, subjects commonly receive treatment and report daily cigarette consumption over a period of several weeks. Although the outcome at the end of this period is an important indicator of treatment success, substantial uncertainty remains on how an individual's smoking behavior will evolve over time. Therefore it is of interest to predict long-term smoking cessation success based on short-term clinical observations. We develop a Bayesian method for prediction, based on a cure-mixture frailty model we proposed earlier, that describes the process of transition between abstinence and smoking. Specifically we propose a two-stage prediction algorithm that first uses importance sampling to generate subject-specific frailties from their posterior distributions conditional on the observed data, then samples predicted future smoking behavior trajectories from the estimated model parameters and sampled frailties. We apply the method to data from two randomized smoking cessation trials comparing bupropion to placebo. Comparisons of actual smoking status at one year with predictions from our model and from a variety of empirical methods suggest that our method gives excellent predictions.
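To illustrate the two-stage idea (importance sampling of subject-specific frailties, then forward simulation), the following is a minimal Python sketch; the function names, the made-up likelihood, and the weekly quit-probability rule are ours, not the authors' cure-mixture frailty model.

    import numpy as np

    rng = np.random.default_rng(0)

    def importance_sample_frailty(loglik, n_draws=5000, prior_shape=1.0, prior_rate=1.0):
        # Stage 1: draw frailties from a gamma prior (the proposal) and resample
        # with weights proportional to the subject's observed-data likelihood.
        proposal = rng.gamma(prior_shape, 1.0 / prior_rate, size=n_draws)
        logw = np.array([loglik(u) for u in proposal])
        w = np.exp(logw - logw.max())
        w /= w.sum()
        idx = rng.choice(n_draws, size=n_draws, p=w, replace=True)
        return proposal[idx]                      # approximate posterior draws

    def predict_future_abstinence(frailties, base_quit_prob=0.3, horizon_weeks=40):
        # Stage 2: for each posterior frailty draw, simulate a weekly abstinence
        # trajectory; here a larger frailty lowers the weekly chance of abstinence.
        p = np.clip(base_quit_prob / frailties, 0.0, 1.0)
        return rng.random((len(frailties), horizon_weeks)) < p[:, None]

    # toy subject whose (made-up) log-likelihood concentrates the frailty near 2
    draws = importance_sample_frailty(lambda u: -0.5 * (u - 2.0) ** 2)
    traj = predict_future_abstinence(draws)
    print("predicted P(abstinent at 1 year) ~", traj[:, -1].mean())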

3.
A two-component model for counts of infectious diseases (total citations: 1; self-citations: 0; citations by others: 1)
We propose a stochastic model for the analysis of time series of disease counts as collected in typical surveillance systems on notifiable infectious diseases. The model is based on a Poisson or negative binomial observation model with two components: a parameter-driven component relates the disease incidence to latent parameters describing endemic seasonal patterns, which are typical for infectious disease surveillance data. An observation-driven or epidemic component is modeled with an autoregression on the number of cases at the previous time points. The autoregressive parameter is allowed to change over time according to a Bayesian changepoint model with an unknown number of changepoints. Parameter estimates are obtained through Bayesian model averaging using Markov chain Monte Carlo techniques. We illustrate our approach through analysis of simulated data and real notification data obtained from the German infectious disease surveillance system, administered by the Robert Koch Institute in Berlin. Software to fit the proposed model can be obtained from http://www.statistik.lmu.de/~mhofmann/twins.
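A plausible formalization of the two components, in our own notation (the published parametrization may differ):

    y_t \mid \mu_t \sim \mathrm{NegBin}(\mu_t, \psi), \qquad
    \mu_t = \nu_t + \lambda_t\, y_{t-1}, \qquad
    \log \nu_t = \alpha + \sum_{s=1}^{S} \bigl\{ \gamma_s \sin(\omega_s t) + \delta_s \cos(\omega_s t) \bigr\},

where \nu_t is the parameter-driven endemic (seasonal) component, \lambda_t y_{t-1} is the observation-driven epidemic component, and \lambda_t is piecewise constant with an unknown number of changepoints.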

4.
Bobb JF, Dominici F, Peng RD. Biometrics 2011, 67(4):1605-1616
Estimating the risks heat waves pose to human health is a critical part of assessing the future impact of climate change. In this article, we propose a flexible class of time series models to estimate the relative risk of mortality associated with heat waves and conduct Bayesian model averaging (BMA) to account for the multiplicity of potential models. Applying these methods to data from 105 U.S. cities for the period 1987-2005, we identify those cities having a high posterior probability of increased mortality risk during heat waves, examine the heterogeneity of the posterior distributions of mortality risk across cities, assess sensitivity of the results to the selection of prior distributions, and compare our BMA results to a model selection approach. Our results show that no single model best predicts risk across the majority of cities, and that for some cities heat-wave risk estimation is sensitive to model choice. Although model averaging leads to posterior distributions with increased variance as compared to statistical inference conditional on a model obtained through model selection, we find that the posterior mean of heat wave mortality risk is robust to accounting for model uncertainty over a broad class of models.
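The BMA computation referred to here is the standard one; writing \Delta for the heat-wave relative risk and M_1, ..., M_K for the candidate time series models (generic notation, not the paper's),

    p(\Delta \mid y) = \sum_{k=1}^{K} p(\Delta \mid y, M_k)\, p(M_k \mid y), \qquad
    p(M_k \mid y) = \frac{p(y \mid M_k)\, p(M_k)}{\sum_{j=1}^{K} p(y \mid M_j)\, p(M_j)},

so posterior uncertainty about \Delta includes a between-model component, which is why the averaged posteriors have larger variance than inference conditional on a single selected model.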

5.
Often there is substantial uncertainty in the selection of confounders when estimating the association between an exposure and health. We define this type of uncertainty as 'adjustment uncertainty'. We propose a general statistical framework for handling adjustment uncertainty in exposure effect estimation for a large number of confounders, we describe a specific implementation, and we develop associated visualization tools. Theoretical results and simulation studies show that the proposed method provides consistent estimators of the exposure effect and its variance. We also show that, when the goal is to estimate an exposure effect accounting for adjustment uncertainty, Bayesian model averaging with posterior model probabilities approximated using information criteria can fail to estimate the exposure effect and can over- or underestimate its variance. We compare our approach to Bayesian model averaging using time series data on levels of fine particulate matter and mortality.
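The information-criterion approximation examined in this abstract is typically the BIC weight (a standard formula, not specific to this paper):

    p(M_k \mid y) \approx \frac{\exp\!\left(-\tfrac{1}{2}\,\mathrm{BIC}_k\right)}{\sum_{j} \exp\!\left(-\tfrac{1}{2}\,\mathrm{BIC}_j\right)},

and the abstract's point is that plugging such weights into Bayesian model averaging can give a biased exposure effect estimate and a mis-stated variance when the goal is confounder adjustment.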

6.
Disease incidence or mortality data are typically available as rates or counts for specified regions, collected over time. We propose Bayesian nonparametric spatial modeling approaches to analyze such data. We develop a hierarchical specification using spatial random effects modeled with a Dirichlet process prior. The Dirichlet process is centered around a multivariate normal distribution. This latter distribution arises from a log-Gaussian process model that provides a latent incidence rate surface, followed by block averaging to the areal units determined by the regions in the study. With regard to the resulting posterior predictive inference, the modeling approach is shown to be equivalent to an approach based on block averaging of a spatial Dirichlet process to obtain a prior probability model for the finite dimensional distribution of the spatial random effects. We introduce a dynamic formulation for the spatial random effects to extend the model to spatio-temporal settings. Posterior inference is implemented through Gibbs sampling. We illustrate the methodology with simulated data as well as with a data set on lung cancer incidences for all 88 counties in the state of Ohio over an observation period of 21 years.
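One way to write the hierarchy described above, assuming counts y_i with known expected counts E_i for the n regions (our notation; the paper's exact observation model may differ):

    y_i \mid \theta_i \sim \mathrm{Poisson}\!\left(E_i\, e^{\theta_i}\right), \qquad
    (\theta_1, \dots, \theta_n) \mid G \sim G, \qquad G \sim \mathrm{DP}(\alpha, G_0),

with the base measure G_0 a multivariate normal induced by block averaging a latent log-Gaussian process Z(s) over the areal units, \theta_i^{(0)} = |A_i|^{-1} \int_{A_i} Z(s)\, ds; the dynamic extension replaces the static random effects with a time-evolving version of the same construction.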

7.
Wu CH, Drummond AJ. Genetics 2011, 188(1):151-164
We provide a framework for Bayesian coalescent inference from microsatellite data that enables inference of population history parameters averaged over microsatellite mutation models. To achieve this we first implemented a rich family of microsatellite mutation models and related components in the software package BEAST. BEAST is a powerful tool that performs Bayesian MCMC analysis on molecular data to make coalescent and evolutionary inferences. Our implementation permits the application of existing nonparametric methods to microsatellite data. The implemented microsatellite models are based on the replication slippage mechanism and focus on three properties of microsatellite mutation: length dependency of mutation rate, mutational bias toward expansion or contraction, and number of repeat units changed in a single mutation event. We develop a new model that facilitates microsatellite model averaging and Bayesian model selection by transdimensional MCMC. With Bayesian model averaging, the posterior distributions of population history parameters are integrated across a set of microsatellite models and thus account for model uncertainty. Simulated data are used to evaluate our method in terms of accuracy and precision of estimation and also identification of the true mutation model. Finally we apply our method to a red colobus monkey data set as an example.

8.
Tang Y, Ghosal S, Roy A. Biometrics 2007, 63(4):1126-1134
We propose a Dirichlet process mixture model (DPMM) for the P-value distribution in a multiple testing problem. The DPMM allows us to obtain posterior estimates of quantities such as the proportion of true null hypotheses and the probability of rejection of a single hypothesis. We describe a Markov chain Monte Carlo algorithm for computing the posterior and the posterior estimates. We propose an estimator of the positive false discovery rate based on these posterior estimates and investigate the performance of the proposed estimator via simulation. We also apply our methodology to analyze a leukemia data set.
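In generic notation, the two-group mixture underlying such P-value models and a commonly used identity for the positive false discovery rate at threshold \gamma (assuming uniform null P-values) are

    f(p) = \pi_0 \cdot 1 + (1 - \pi_0)\, f_1(p), \qquad
    f_1(p) = \int k(p \mid \phi)\, dG(\phi), \quad G \sim \mathrm{DP}(\alpha, G_0), \qquad
    \mathrm{pFDR}(\gamma) = \frac{\pi_0\, \gamma}{\Pr(P \le \gamma)},

into which posterior estimates of \pi_0 and of the mixture can be substituted; the kernel k and the exact estimator used in the paper are not reproduced here.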

9.
In this article, we develop a latent class model with class probabilities that depend on subject-specific covariates. One of our major goals is to identify important predictors of latent classes. We consider methodology that allows estimation of latent classes while allowing for variable selection uncertainty. We propose a Bayesian variable selection approach and implement a stochastic search Gibbs sampler for posterior computation to obtain model-averaged estimates of quantities of interest such as marginal inclusion probabilities of predictors. Our methods are illustrated through simulation studies and application to data on weight gain during pregnancy, where it is of interest to identify important predictors of latent weight gain classes.
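A generic spike-and-slab, multinomial-logit formulation in the spirit of this description (our notation; the paper's parametrization may differ):

    \Pr(c_i = k \mid x_i) = \frac{\exp(x_i^{\top} \beta_k)}{\sum_{l=1}^{K} \exp(x_i^{\top} \beta_l)}, \qquad
    \beta_{kj} \mid \gamma_j \sim (1 - \gamma_j)\, \delta_0 + \gamma_j\, N(0, \tau^2), \qquad
    \gamma_j \sim \mathrm{Bernoulli}(\pi_j),

where c_i is the latent class of subject i and the marginal inclusion probability of predictor j is \Pr(\gamma_j = 1 \mid \text{data}), estimated from the stochastic search Gibbs output.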

10.
We propose a Bayesian hypothesis testing procedure for comparing the distributions of paired samples. The procedure is based on a flexible model for the joint distribution of both samples. The flexibility is given by a mixture of Dirichlet processes. Our proposal uses a spike-slab prior specification for the base measure of the Dirichlet process and a particular parametrization for the kernel of the mixture in order to facilitate comparisons and posterior inference. The joint model allows us to derive the marginal distributions and test whether they differ or not. The procedure exploits the correlation between samples, relaxes the parametric assumptions, and detects possible differences throughout the entire distributions. A Monte Carlo simulation study comparing the performance of this strategy to other traditional alternatives is provided. Finally, we apply the proposed approach to spirometry data collected in the United States to investigate changes in pulmonary function in children and adolescents in response to air polluting factors.

11.
We present Bayesian hierarchical models for the analysis of Affymetrix GeneChip data. The approach we take differs from other available approaches in two fundamental aspects. Firstly, we aim to integrate all processing steps of the raw data in a common statistically coherent framework, allowing all components and thus associated errors to be considered simultaneously. Secondly, inference is based on the full posterior distribution of gene expression indices and derived quantities, such as fold changes or ranks, rather than on single point estimates. Measures of uncertainty on these quantities are thus available. The models presented represent the first building block for integrated Bayesian Analysis of Affymetrix GeneChip data: the models take into account additive as well as multiplicative error, gene expression levels are estimated using perfect match and a fraction of mismatch probes and are modeled on the log scale. Background correction is incorporated by modeling true signal and cross-hybridization explicitly, and a need for further normalization is considerably reduced by allowing for array-specific distributions of nonspecific hybridization. When replicate arrays are available for a condition, posterior distributions of condition-specific gene expression indices are estimated directly, by a simultaneous consideration of replicate probe sets, avoiding averaging over estimates obtained from individual replicate arrays. The performance of the Bayesian model is compared to that of standard available point estimate methods on subsets of the well known GeneLogic and Affymetrix spike-in data. The Bayesian model is found to perform well and the integrated procedure presented appears to hold considerable promise for further development.
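A schematic version of the kind of probe-level model described, in our notation and with the distributional details simplified (the published model should be consulted for its exact form):

    \mathrm{PM}_{gj} = S_{gj} + H_{gj} + \varepsilon_{gj}, \qquad
    \mathrm{MM}_{gj} = \phi\, S_{gj} + H_{gj} + \varepsilon'_{gj}, \qquad
    \log(S_{gj} + 1) \sim N(\mu_{gc}, \sigma_{gc}^2), \qquad
    \log(H_{gj} + 1) \sim N(\lambda_j, \eta_j^2),

where S_{gj} is the true signal for gene g on probe pair j, H_{gj} the array-specific nonspecific hybridization, \phi the fraction of signal binding to the mismatch probe, and \mu_{gc} the condition-specific gene expression index estimated jointly across replicate arrays.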

12.
Array-based comparative genomic hybridization (array-CGH) is a high throughput, high resolution technique for studying the genetics of cancer. Analysis of array-CGH data typically involves estimation of the underlying chromosome copy numbers from the log fluorescence ratios and segmenting the chromosome into regions with the same copy number at each location. We propose, for the analysis of array-CGH data, a new stochastic segmentation model and an associated estimation procedure that has attractive statistical and computational properties. An important benefit of this Bayesian segmentation model is that it yields explicit formulas for posterior means, which can be used to estimate the signal directly without performing segmentation. Other quantities relating to the posterior distribution that are useful for providing confidence assessments of any given segmentation can also be estimated by using our method. We propose an approximation method whose computation time is linear in sequence length, which makes our method practically applicable to the new higher density arrays. Simulation studies and applications to real array-CGH data illustrate the advantages of the proposed approach.

13.
We introduce the Bayesian skyline plot, a new method for estimating past population dynamics through time from a sample of molecular sequences without dependence on a prespecified parametric model of demographic history. We describe a Markov chain Monte Carlo sampling procedure that efficiently samples a variant of the generalized skyline plot, given sequence data, and combines these plots to generate a posterior distribution of effective population size through time. We apply the Bayesian skyline plot to simulated data sets and show that it correctly reconstructs demographic history under canonical scenarios. Finally, we compare the Bayesian skyline plot model to previous coalescent approaches by analyzing two real data sets (hepatitis C virus in Egypt and mitochondrial DNA of Beringian bison) that have been previously investigated using alternative coalescent methods. In the bison analysis, we detect a severe but previously unrecognized bottleneck, estimated to have occurred 10,000 radiocarbon years ago, which coincides with both the earliest undisputed record of large numbers of humans in Alaska and the megafaunal extinctions in North America at the beginning of the Holocene.

14.
The importance of accommodating the phylogenetic history of a group when performing a comparative analysis is now widely recognized. The typical approaches either assume the tree is known without error, or they base inferences on a collection of well-supported trees or on a collection of trees generated under a stochastic model of cladogenesis. However, these approaches do not adequately account for the uncertainty of phylogenetic trees in a comparative analysis, especially when data relevant to the phylogeny of a group are available. Here, we develop a method for performing comparative analyses that is based on an extension of Felsenstein's independent contrasts method. Uncertainties in the phylogeny, branch lengths, and other parameters are accommodated by averaging over all possible trees, weighting each by the probability that the tree is correct. We do this in a Bayesian framework and use Markov chain Monte Carlo to perform the high-dimensional summations and integrations required by the analysis. We illustrate the method using comparative characters sampled from Anolis lizards.
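The averaging over trees described here amounts to a standard decomposition (generic notation): writing \theta for the comparative-analysis parameters, X for the trait data, and D for the sequence data,

    p(\theta \mid X, D) = \sum_{\tau} \int p(\theta \mid X, \tau, v)\, p(\tau, v \mid D)\, dv,

where \tau ranges over topologies and v over branch lengths and other phylogenetic parameters; the MCMC performs the summation and integration.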

15.
Adrian E. Raftery, Le Bao. Biometrics 2010, 66(4):1162-1173
The Joint United Nations Programme on HIV/AIDS (UNAIDS) has decided to use Bayesian melding as the basis for its probabilistic projections of HIV prevalence in countries with generalized epidemics. This combines a mechanistic epidemiological model, prevalence data, and expert opinion. Initially, the posterior distribution was approximated by sampling-importance-resampling, which is simple to implement, easy to interpret, transparent to users, and gave acceptable results for most countries. For some countries, however, this is not computationally efficient because the posterior distribution tends to be concentrated around nonlinear ridges and can also be multimodal. We propose instead incremental mixture importance sampling (IMIS), which iteratively builds up a better importance sampling function. This retains the simplicity and transparency of sampling importance resampling, but is much more efficient computationally. It also leads to a simple estimator of the integrated likelihood that is the basis for Bayesian model comparison and model averaging. In simulation experiments and on real data, it outperformed both sampling importance resampling and three publicly available generic Markov chain Monte Carlo algorithms for this kind of problem.
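A simplified Python sketch of the IMIS idea for generic prior/likelihood callables; the neighborhood rule, fixed iteration count, and toy example are ours and are much cruder than the published algorithm or the UNAIDS software.

    import numpy as np
    from scipy.stats import multivariate_normal

    rng = np.random.default_rng(1)

    def imis(prior_sample, log_prior, log_lik, n0=1000, b=100, iters=15):
        theta = prior_sample(n0)                       # (n0, d) initial prior draws
        d = theta.shape[1]
        comps = []                                     # Gaussian components (mean, cov)

        def log_envelope(x):                           # mixture importance density
            n = len(x)
            dens = (n0 / n) * np.exp(log_prior(x))
            for m, c in comps:
                dens += (b / n) * multivariate_normal(m, c).pdf(x)
            return np.log(dens)

        def weights(x):
            logw = np.array([log_lik(t) for t in x]) + log_prior(x) - log_envelope(x)
            w = np.exp(logw - logw.max())
            return w / w.sum()

        for _ in range(iters):
            w = weights(theta)
            centre = theta[np.argmax(w)]               # highest-weight point so far
            # covariance from the b points closest to the centre (a crude surrogate
            # for the weighted nearest-neighbour covariance of the real algorithm)
            nbrs = theta[np.argsort(np.linalg.norm(theta - centre, axis=1))[:b]]
            cov = np.cov(nbrs, rowvar=False) + 1e-6 * np.eye(d)
            comps.append((centre, cov))
            theta = np.vstack([theta, rng.multivariate_normal(centre, cov, size=b)])

        return theta, weights(theta)

    # toy example: N(0, I) prior in 2-d, likelihood concentrated near (2, 2)
    prior = lambda n: rng.standard_normal((n, 2))
    lp = lambda x: multivariate_normal(np.zeros(2), np.eye(2)).logpdf(x)
    ll = lambda t: multivariate_normal([2.0, 2.0], 0.05 * np.eye(2)).logpdf(t)
    theta, w = imis(prior, lp, ll)
    print("posterior mean ~", (w[:, None] * theta).sum(axis=0))

As a generic importance-sampling identity, the integrated likelihood \int L(\theta) p(\theta)\, d\theta can be estimated by the average of the unnormalized weights L(\theta_i) p(\theta_i) / q(\theta_i) over all draws, consistent with the simple estimator mentioned in the abstract.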

16.
Prior specification is an essential component of parameter estimation and model comparison in approximate Bayesian computation (ABC). Oaks et al. present a simulation-based power analysis of msBayes and conclude that msBayes has low power to detect genuinely random divergence times across taxa, and suggest the cause is Lindley's paradox. Although the predictions are similar, we show that their findings are more fundamentally explained by insufficient prior sampling that arises with poorly chosen wide priors that critically undersample nonsimultaneous divergence histories of high likelihood. In a reanalysis of their data on Philippine Island vertebrates, we show how this problem can be circumvented by expanding upon a previously developed procedure that accommodates uncertainty in prior selection using Bayesian model averaging. When these procedures are used, msBayes supports recent divergences without support for synchronous divergence in the Oaks et al. data, and we further present a simulation analysis that demonstrates that msBayes can have high power to detect asynchronous divergence under narrower priors for divergence time. Our findings highlight the need for exploration of plausible parameter space and prior sampling efficiency for ABC samplers in high dimensions. We discuss potential improvements to msBayes and conclude that when used appropriately with model averaging, msBayes remains an effective and powerful tool.

17.
Brent A. Coull. Biometrics 2011, 67(2):486-494
In many biomedical investigations, a primary goal is the identification of subjects who are susceptible to a given exposure or treatment of interest. We focus on methods for addressing this question in longitudinal studies when interest focuses on relating susceptibility to a subject's baseline or mean outcome level. In this context, we propose a random intercepts–functional slopes model that relaxes the assumption of linear association between random coefficients in existing mixed models and yields an estimate of the functional form of this relationship. We propose a penalized spline formulation for the nonparametric function that represents this relationship, and implement a fully Bayesian approach to model fitting. We investigate the frequentist performance of our method via simulation, and apply the model to data on the effects of particulate matter on coronary blood flow from an animal toxicology study. The general principles introduced here apply more broadly to settings in which interest focuses on the relationship between baseline and change over time.
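A sketch of the kind of model described, in our notation: for outcome y_{ij} of subject i at occasion j with exposure (or time) covariate x_{ij},

    y_{ij} = \beta_0 + b_{0i} + \bigl\{\beta_1 + f(b_{0i})\bigr\} x_{ij} + \varepsilon_{ij}, \qquad
    f(b) = \sum_{k=1}^{K} \theta_k B_k(b), \quad \theta_k \sim N(0, \sigma_\theta^2),

where b_{0i} is the random intercept (the subject's baseline level) and f is a penalized spline (the Gaussian shrinkage prior on the basis coefficients plays the role of the penalty), replacing the usual linear association between random intercepts and random slopes.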

18.
MOTIVATION: Selecting a small number of relevant genes for accurate classification of samples is essential for the development of diagnostic tests. We present the Bayesian model averaging (BMA) method for gene selection and classification of microarray data. Typical gene selection and classification procedures ignore model uncertainty and use a single set of relevant genes (model) to predict the class. BMA accounts for the uncertainty about the best set to choose by averaging over multiple models (sets of potentially overlapping relevant genes). RESULTS: We have shown that BMA selects smaller numbers of relevant genes (compared with other methods) and achieves a high prediction accuracy on three microarray datasets. Our BMA algorithm is applicable to microarray datasets with any number of classes, and outputs posterior probabilities for the selected genes and models. Our selected models typically consist of only a few genes. The combination of high accuracy, small numbers of genes and posterior probabilities for the predictions should make BMA a powerful tool for developing diagnostics from expression data. AVAILABILITY: The source codes and datasets used are available from our Supplementary website.
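To illustrate the flavor of model averaging for classification (not the published BMA algorithm), the following Python sketch averages logistic models over all two-gene subsets using BIC-approximated posterior model probabilities; the data and the model space are made up.

    import numpy as np
    from itertools import combinations
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(2)

    # toy expression matrix: 60 samples x 10 genes, class driven by genes 0 and 3
    X = rng.standard_normal((60, 10))
    y = (X[:, 0] - X[:, 3] + 0.5 * rng.standard_normal(60) > 0).astype(int)

    def bic_logistic(genes):
        # fit a logistic model on a gene subset and return its BIC and the fit
        m = LogisticRegression(max_iter=1000).fit(X[:, genes], y)
        p = m.predict_proba(X[:, genes])[:, 1]
        loglik = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
        k = len(genes) + 1                              # coefficients + intercept
        return k * np.log(len(y)) - 2 * loglik, m

    models = [list(c) for c in combinations(range(10), 2)]   # all 2-gene models
    fits = [bic_logistic(g) for g in models]
    bic = np.array([f[0] for f in fits])
    w = np.exp(-0.5 * (bic - bic.min()))
    w /= w.sum()                                        # approx. posterior model probabilities

    # posterior inclusion probability per gene, and a model-averaged prediction
    incl = [w[[i for i, g in enumerate(models) if j in g]].sum() for j in range(10)]
    x_new = rng.standard_normal((1, 10))
    p_avg = sum(wi * f[1].predict_proba(x_new[:, g])[:, 1][0] for wi, f, g in zip(w, fits, models))
    print("inclusion probabilities:", np.round(incl, 2))
    print("model-averaged P(class 1 | x_new) =", round(p_avg, 3))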

19.
We conducted a simulation study to compare two methods that have been recently used in clinical literature for the dynamic prediction of time to pregnancy. The first is landmarking, a semi-parametric method where predictions are updated as time progresses using the patient subset still at risk at that time point. The second is the beta-geometric model that updates predictions over time from a parametric model estimated on all data and is specific to applications with a discrete time to event outcome. The beta-geometric model introduces unobserved heterogeneity by modelling the chance of an event per discrete time unit according to a beta distribution. Due to selection of patients with lower chances as time progresses, the predicted probability of an event decreases over time. Both methods were recently used to develop models predicting the chance to conceive naturally. The advantages, disadvantages and accuracy of these two methods are unknown. We simulated time-to-pregnancy data according to different scenarios. We then compared the two methods by the following out-of-sample metrics: bias and root mean squared error in the average prediction, root mean squared error in individual predictions, Brier score and c statistic. We consider different scenarios including data-generating mechanisms for which the models are misspecified. We applied the two methods on a clinical dataset comprising 4999 couples. Finally, we discuss the pros and cons of the two methods based on our results and present recommendations for use of either of the methods in different settings and (effective) sample sizes.
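For reference, the beta-geometric model mentioned here has a closed form (a standard result, our notation): with per-cycle conception probability p ~ Beta(\alpha, \beta) and cycles independent given p,

    \Pr(T = t) = \int_0^1 p\,(1 - p)^{t-1}\, \frac{p^{\alpha - 1}(1 - p)^{\beta - 1}}{B(\alpha, \beta)}\, dp
               = \frac{B(\alpha + 1,\, \beta + t - 1)}{B(\alpha, \beta)}, \qquad
    \Pr(T = t \mid T \ge t) = \frac{\alpha}{\alpha + \beta + t - 1},

so the per-cycle chance among couples still trying declines with t, which is the selection effect driving the model's decreasing predicted probabilities.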

20.
Tropical forests play a critical role in carbon and water cycles at a global scale. Rapid climate change is anticipated in tropical regions over the coming decades and, under a warmer and drier climate, tropical forests are likely to be net sources of carbon rather than sinks. However, our understanding of tropical forest response and feedback to climate change is very limited. Efforts to model climate change impacts on carbon fluxes in tropical forests have not reached a consensus. Here, we use the Ecosystem Demography model (ED2) to predict carbon fluxes of a Puerto Rican tropical forest under realistic climate change scenarios. We parameterized ED2 with species-specific tree physiological data using the Predictive Ecosystem Analyzer workflow and projected the fate of this ecosystem under five future climate scenarios. The model successfully captured interannual variability in the dynamics of this tropical forest. Model predictions closely followed observed values across a wide range of metrics including aboveground biomass, tree diameter growth, tree size class distributions, and leaf area index. Under a future warming and drying climate scenario, the model predicted reductions in carbon storage and tree growth, together with large shifts in forest community composition and structure. Such rapid changes in climate led the forest to transition from a sink to a source of carbon. Growth respiration and root allocation parameters were responsible for the highest fraction of predictive uncertainty in modeled biomass, highlighting the need to target these processes in future data collection. Our study is the first effort to rely on Bayesian model calibration and synthesis to elucidate the key physiological parameters that drive uncertainty in tropical forest responses to climatic change. We propose a new path forward for model-data synthesis that can substantially reduce uncertainty in our ability to model tropical forest responses to future climate.
