Similar literature
Found 20 similar articles; search took 15 ms
1.
Detection-nondetection data are often used to investigate species range dynamics using Bayesian occupancy models which rely on the use of Markov chain Monte Carlo (MCMC) methods to sample from the posterior distribution of the parameters of the model. In this article we develop two Variational Bayes (VB) approximations to the posterior distribution of the parameters of a single-season site occupancy model which uses logistic link functions to model the probability of species occurrence at sites and of species detection probabilities. This task is accomplished through the development of iterative algorithms that do not use MCMC methods. Simulations and small practical examples demonstrate the effectiveness of the proposed technique. We specifically show that (under certain circumstances) the variational distributions can provide accurate approximations to the true posterior distributions of the parameters of the model when the number of visits per site (K) is as low as three and that the accuracy of the approximations improves as K increases. We also show that the methodology can be used to obtain the posterior distribution of the predictive distribution of the proportion of sites occupied (PAO).

2.
Xie W  Lewis PO  Fan Y  Kuo L  Chen MH 《Systematic biology》2011,60(2):150-160
The marginal likelihood is commonly used for comparing different evolutionary models in Bayesian phylogenetics and is the central quantity used in computing Bayes Factors for comparing model fit. A popular method for estimating marginal likelihoods, the harmonic mean (HM) method, can be easily computed from the output of a Markov chain Monte Carlo analysis but often greatly overestimates the marginal likelihood. The thermodynamic integration (TI) method is much more accurate than the HM method but requires more computation. In this paper, we introduce a new method, steppingstone sampling (SS), which uses importance sampling to estimate each ratio in a series (the "stepping stones") bridging the posterior and prior distributions. We compare the performance of the SS approach to the TI and HM methods in simulation and using real data. We conclude that the greatly increased accuracy of the SS and TI methods argues for their use instead of the HM method, despite the extra computation needed.
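The stepping-stone idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a toy model (observations drawn from N(mu, 1) with a N(0, tau2) prior on mu) chosen because each power posterior is Gaussian and can be sampled exactly, so no MCMC is needed. The cubic schedule for the powers `betas`, the function names, and the parameter defaults are all assumptions made for illustration.

```python
import math
import random


def log_likelihood(mu, data):
    """Log-likelihood of i.i.d. N(mu, 1) observations."""
    return sum(-0.5 * math.log(2.0 * math.pi) - 0.5 * (y - mu) ** 2 for y in data)


def log_mean_exp(values):
    """Numerically stable log of the average of exp(values)."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values) / len(values))


def stepping_stone_log_marginal(data, tau2=1.0, n_stones=32, n_draws=2000, seed=0):
    """Estimate the log marginal likelihood as a sum of log ratios.

    Each ratio is estimated by importance sampling, with the power
    posterior at the lower power serving as the importance distribution.
    """
    rng = random.Random(seed)
    n, total = len(data), sum(data)
    # Powers from 0 (prior) to 1 (posterior), packed near the prior (assumed schedule).
    betas = [(k / n_stones) ** 3 for k in range(n_stones + 1)]
    log_z = 0.0
    for b_prev, b_next in zip(betas[:-1], betas[1:]):
        # Power posterior likelihood^b_prev * prior is Gaussian for this toy model.
        precision = b_prev * n + 1.0 / tau2
        mean, sd = b_prev * total / precision, math.sqrt(1.0 / precision)
        terms = [(b_next - b_prev) * log_likelihood(rng.gauss(mean, sd), data)
                 for _ in range(n_draws)]
        log_z += log_mean_exp(terms)
    return log_z


def exact_log_marginal(data, tau2=1.0):
    """Closed-form log marginal likelihood of the toy model, for checking."""
    n, s, ss = len(data), sum(data), sum(y * y for y in data)
    return (-0.5 * n * math.log(2.0 * math.pi) - 0.5 * math.log(1.0 + n * tau2)
            - 0.5 * (ss - tau2 * s * s / (1.0 + n * tau2)))
```

Comparing `stepping_stone_log_marginal(data)` with `exact_log_marginal(data)` on a small data set gives a quick check of the estimator. In a realistic phylogenetic setting the draws at each power would come from MCMC rather than from a closed-form Gaussian.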

3.
Parameter inference and model selection are very important for mathematical modeling in systems biology. Bayesian statistics can be used to conduct both parameter inference and model selection. In particular, the framework named approximate Bayesian computation is often used for parameter inference and model selection in systems biology. However, Monte Carlo methods need to be used to compute Bayesian posterior distributions. In addition, the posterior distributions of parameters are sometimes almost uniform or very similar to their prior distributions. In such cases, it is difficult to choose one specific value of a parameter with high credibility as the representative value of the distribution. To overcome these problems, we introduced one of the population Monte Carlo algorithms, population annealing. Although population annealing is usually used in statistical mechanics, we showed that population annealing can be used to compute Bayesian posterior distributions in the approximate Bayesian computation framework. To deal with un-identifiability of the representative values of parameters, we proposed to run the simulations with the parameter ensemble sampled from the posterior distribution, named the “posterior parameter ensemble”. We showed that population annealing is an efficient and convenient algorithm to generate the posterior parameter ensemble. We also showed that the simulations with the posterior parameter ensemble can not only reproduce the data used for parameter inference, but also capture and predict data that were not used for parameter inference. Lastly, we introduced the marginal likelihood in the approximate Bayesian computation framework for Bayesian model selection. We showed that population annealing enables us to compute the marginal likelihood in the approximate Bayesian computation framework and conduct model selection depending on the Bayes factor.

4.
Ayres KL  Balding DJ 《Genetics》2001,157(1):413-423
We describe a Bayesian approach to analyzing multilocus genotype or haplotype data to assess departures from gametic (linkage) equilibrium. Our approach employs a Markov chain Monte Carlo (MCMC) algorithm to approximate the posterior probability distributions of disequilibrium parameters. The distributions are computed exactly in some simple settings. Among other advantages, posterior distributions can be presented visually, which allows the uncertainties in parameter estimates to be readily assessed. In addition, background knowledge can be incorporated, where available, to improve the precision of inferences. The method is illustrated by application to previously published datasets; implications for multilocus forensic match probabilities and for simple association-based gene mapping are also discussed.

5.
We introduce the Bayesian skyline plot, a new method for estimating past population dynamics through time from a sample of molecular sequences without dependence on a prespecified parametric model of demographic history. We describe a Markov chain Monte Carlo sampling procedure that efficiently samples a variant of the generalized skyline plot, given sequence data, and combines these plots to generate a posterior distribution of effective population size through time. We apply the Bayesian skyline plot to simulated data sets and show that it correctly reconstructs demographic history under canonical scenarios. Finally, we compare the Bayesian skyline plot model to previous coalescent approaches by analyzing two real data sets (hepatitis C virus in Egypt and mitochondrial DNA of Beringian bison) that have been previously investigated using alternative coalescent methods. In the bison analysis, we detect a severe but previously unrecognized bottleneck, estimated to have occurred 10,000 radiocarbon years ago, which coincides with both the earliest undisputed record of large numbers of humans in Alaska and the megafaunal extinctions in North America at the beginning of the Holocene.

6.
A general Bayesian model, Diploffect, is described for estimating the effects of founder haplotypes at quantitative trait loci (QTL) detected in multiparental genetic populations; such populations include the Collaborative Cross (CC), Heterogeneous Stocks (HS), and many others for which local genetic variation is well described by an underlying, usually probabilistically inferred, haplotype mosaic. Our aim is to provide a framework for coherent estimation of haplotype and diplotype (haplotype pair) effects that takes into account the following: uncertainty in haplotype composition for each individual; uncertainty arising from small sample sizes and infrequently observed haplotype combinations; possible effects of dominance (for noninbred subjects); and genetic background; and that provides a means to incorporate data that may be incomplete or have a hierarchical structure. Using the results of a probabilistic haplotype reconstruction as prior information, we obtain posterior distributions at the QTL for both haplotype effects and haplotype composition. Two alternative computational approaches are supplied: a Markov chain Monte Carlo sampler and a procedure based on importance sampling of integrated nested Laplace approximations. Using simulations of QTL in the incipient CC (pre-CC) and Northport HS populations, we compare the accuracy of Diploffect, approximations to it, and more commonly used approaches based on Haley–Knott regression, describing trade-offs between these methods. We also estimate effects for three QTL previously identified in those populations, obtaining posterior intervals that describe how the phenotype might be affected by diplotype substitutions at the modeled locus.

7.
8.
We describe a Bayesian method for investigating correlated evolution of discrete binary traits on phylogenetic trees. The method fits a continuous-time Markov model to a pair of traits, seeking the best fitting models that describe their joint evolution on a phylogeny. We employ the methodology of reversible-jump (RJ) Markov chain Monte Carlo to search among the large number of possible models, some of which conform to independent evolution of the two traits, others to correlated evolution. The RJ Markov chain visits these models in proportion to their posterior probabilities, thereby directly estimating the support for the hypothesis of correlated evolution. In addition, the RJ Markov chain simultaneously estimates the posterior distributions of the rate parameters of the model of trait evolution. These posterior distributions can be used to test among alternative evolutionary scenarios to explain the observed data. All results are integrated over a sample of phylogenetic trees to account for phylogenetic uncertainty. We implement the method in a program called RJ Discrete and illustrate it by analyzing the question of whether mating system and advertisement of estrus by females have coevolved in the Old World monkeys and great apes.

9.
Salway R  Wakefield J 《Biometrics》2008,64(2):620-626
Summary. This article considers the modeling of single-dose pharmacokinetic data. Traditionally, so-called compartmental models have been used to analyze such data. Unfortunately, the mean functions of such models are sums of exponentials for which inference and computation may not be straightforward. We present an alternative to these models based on generalized linear models, for which desirable statistical properties exist, with a logarithmic link and gamma distribution. The latter has a constant coefficient of variation, which is often appropriate for pharmacokinetic data. Inference is convenient from either a likelihood or a Bayesian perspective. We consider models for both single and multiple individuals, the latter via generalized linear mixed models. For single individuals, Bayesian computation may be carried out with recourse to simulation. We describe a rejection algorithm that, unlike Markov chain Monte Carlo, produces independent samples from the posterior and allows straightforward calculation of Bayes factors for model comparison. We also illustrate how prior distributions may be specified in terms of model-free pharmacokinetic parameters of interest. The methods are applied to data from 12 individuals following administration of the antiasthmatic agent theophylline.

10.
Sun L  Clayton MK 《Biometrics》2008,64(1):74-84
Summary. We address the development of methods for analyzing cross-classified categorical data that are spatially autocorrelated. We first extend the autologistic model to accommodate two variables. Two bivariate autologistic models are constructed, namely a two-step model and a symmetric model. Importance sampling is used to approximate the complex normalizing factors that arise in these models, and Markov chain Monte Carlo techniques are used to generate simulations of posterior distributions. The resulting models are then expanded to accommodate trend surfaces and directional effects. Simulation studies and real data are used to illustrate this method.

11.
Kitada S  Hayashi T  Kishino H 《Genetics》2000,156(4):2063-2079
We developed an empirical Bayes procedure to estimate genetic distances between populations using allele frequencies. This procedure makes it possible to describe the skewness of the genetic distance while taking full account of the uncertainty of the sample allele frequencies. Dirichlet priors of the allele frequencies are specified, and the posterior distributions of the various composite parameters are obtained by Monte Carlo simulation. To avoid overdependence on subjective priors, we adopt a hierarchical model and estimate hyperparameters by maximizing the joint marginal-likelihood function. Taking advantage of the empirical Bayesian procedure, we extend the method to estimate the effective population size using temporal changes in allele frequencies. The method is applied to data sets on red sea bream, herring, northern pike, and ayu broodstock. It is shown that overdispersion overestimates the genetic distance and underestimates the effective population size, if it is not taken into account during the analysis. The joint marginal-likelihood function also estimates the rate of gene flow into island populations.

12.
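Luo Sheng  Lü Qiang 《Chinese Journal of Bioinformatics (生物信息学)》2016,14(2):117-122
In protein structure prediction, sampling means generating states of minimum free energy in conformational space. Traditional sampling methods assign values to the degrees of freedom directly. This works well for small numbers of residues, but for protein structures with more than 100 residues the rapid growth of the conformational space makes it hard to obtain satisfactory structures. This paper introduces the HMC (Hybrid Monte Carlo) sampling method used in deep learning, which samples the protein's degrees of freedom according to a probability distribution and can sample protein structures containing 100, 200, or even more residues. In addition, inter-residue distance constraints are added during sampling, so that within one structure, relative to Rosetta's ab initio protocol, up to 75% (40% on average) of residue pairs are optimized and satisfy the distance constraints.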

13.
The Poisson assumption is popular when data arise in the form of counts. In many applications such counts are fallible. Little research has been done on the Poisson distribution when both false positives and false negatives are present. We present a model in this paper that corrects for misclassification of count data. Bayesian estimators are developed. We provide the actual posterior distributions via integration. Markov chain Monte Carlo results, which are more convenient for large sample sizes, are utilized for inference.

14.
Bayesian inference is a powerful statistical paradigm that has gained popularity in many fields of science, but adoption has been somewhat slower in biophysics. Here, I provide an accessible tutorial on the use of Bayesian methods by focusing on example applications that will be familiar to biophysicists. I first discuss the goals of Bayesian inference and show simple examples of posterior inference using conjugate priors. I then describe Markov chain Monte Carlo sampling and, in particular, discuss Gibbs sampling and Metropolis random walk algorithms with reference to detailed examples. These Bayesian methods (with the aid of Markov chain Monte Carlo sampling) provide a generalizable way of rigorously addressing parameter inference and identifiability for arbitrarily complicated models.
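A Metropolis random walk of the kind mentioned in the abstract above can be sketched in a few lines. The example below is illustrative only and is not taken from the tutorial. It assumes a Bernoulli success rate with a conjugate Beta(a, b) prior, so the sampler's output can be compared with the known Beta posterior. The function names, the step size, and the starting point are assumptions.

```python
import math
import random


def log_posterior(theta, successes, failures, a=1.0, b=1.0):
    """Unnormalized log posterior of a Bernoulli rate under a Beta(a, b) prior."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero posterior density outside (0, 1)
    return ((successes + a - 1.0) * math.log(theta)
            + (failures + b - 1.0) * math.log(1.0 - theta))


def metropolis(successes, failures, n_samples=20000, step=0.2, seed=0):
    """Random-walk Metropolis with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_posterior(theta, successes, failures)
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, successes, failures)
        # Accept with probability min(1, posterior ratio); otherwise keep theta.
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        samples.append(theta)
    return samples
```

With 7 successes and 3 failures under a uniform prior, the exact posterior is Beta(8, 4) with mean 8/12, so the average of the draws after a burn-in period should land close to that value. Because the proposal is symmetric, the acceptance rule needs only the ratio of posterior densities.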

15.
Friel  Nial; Rue  Havard 《Biometrika》2007,94(3):661-672
We illustrate how the recursive algorithm of Reeves & Pettitt (2004) for general factorizable models can be extended to allow exact sampling, maximization of distributions and computation of marginal distributions. All of the methods we describe apply to discrete-valued Markov random fields with nearest neighbour interactions defined on regular lattices; in particular we illustrate that exact inference can be performed for hidden autologistic models defined on moderately sized lattices. In this context we offer an extension of this methodology which allows approximate inference to be carried out for larger lattices without resorting to simulation techniques such as Markov chain Monte Carlo. In particular our work offers the basis for an automatic inference machine for such models.

16.
Taboo-based Monte Carlo search, which restricts sampling of the region near an old configuration, is developed. In this procedure, Monte Carlo simulation and a random search method are combined to improve the sampling efficiency. The feasibility of this method is tested on global optimization of a continuous model function, melting of 256 Lennard-Jones particles at T* = 0.680 and ρ* = 0.850, and polypeptides (alanine dipeptide and Met-enkephalin). From the comparison of results for the model function between our method and other methods, we find an increased convergence rate and a high possibility of escaping from local energy minima. The results for the Lennard-Jones solids and polypeptides show that convergence to the equilibrium state is better than that of other methods. It is also found that no significant bias in the ensemble distribution is detected, though taboo-based Monte Carlo search does not sample the correct ensemble distribution owing to the restriction of the sampling of the region near an old configuration.

17.
Using models to simulate and analyze biological networks requires principled approaches to parameter estimation and model discrimination. We use Bayesian and Monte Carlo methods to recover the full probability distributions of free parameters (initial protein concentrations and rate constants) for mass‐action models of receptor‐mediated cell death. The width of the individual parameter distributions is largely determined by non‐identifiability but covariation among parameters, even those that are poorly determined, encodes essential information. Knowledge of joint parameter distributions makes it possible to compute the uncertainty of model‐based predictions whereas ignoring it (e.g., by treating parameters as a simple list of values and variances) yields nonsensical predictions. Computing the Bayes factor from joint distributions yields the odds ratio (~20‐fold) for competing ‘direct’ and ‘indirect’ apoptosis models having different numbers of parameters. Our results illustrate how Bayesian approaches to model calibration and discrimination combined with single‐cell data represent a generally useful and rigorous approach to discriminate between competing hypotheses in the face of parametric and topological uncertainty.

18.
We describe a novel approach to deducing order parameters and correlation times in proteins using a Bayesian statistical method, and show how likelihood contours, P(τ, S), and confidence levels can be obtained. These results are then compared with those obtained from a simple graphical method, as well as those from Monte Carlo simulations. The Bayes approach has the advantage that it is simple and accurate. Unlike Monte Carlo methods, it gives useful contour plots of probability (also not provided by the simple graphical method), and provides likelihood/confidence information. In addition, the Bayesian approach gives results in very good agreement with those obtained from Monte Carlo simulations, and as such use of Bayesian statistical methods appears to have a promising future for studies of order and dynamics in macromolecules.

19.
Recent advances in big data and analytics research have provided a wealth of large data sets that are too big to be analyzed in their entirety, due to restrictions on computer memory or storage size. New Bayesian methods have been developed for data sets that are large only due to large sample sizes. These methods partition big data sets into subsets and perform independent Bayesian Markov chain Monte Carlo analyses on the subsets. The methods then combine the independent subset posterior samples to estimate a posterior density given the full data set. These approaches were shown to be effective for Bayesian models including logistic regression models, Gaussian mixture models and hierarchical models. Here, we introduce the R package parallelMCMCcombine which carries out four of these techniques for combining independent subset posterior samples. We illustrate each of the methods using a Bayesian logistic regression model for simulation data and a Bayesian Gamma model for real data; we also demonstrate features and capabilities of the R package. The package assumes the user has carried out the Bayesian analysis and has produced the independent subposterior samples outside of the package. The methods are primarily suited to models with unknown parameters of fixed dimension that exist in continuous parameter spaces. We envision this tool will allow researchers to explore the various methods for their specific applications and will assist future progress in this rapidly developing field.
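The partition-then-combine workflow described in the abstract above can be sketched with one simple combining rule: a precision-weighted average of draws taken across subsets. This is a Python illustration, not the R package's interface, and the abstract does not say whether this rule matches any of the package's four methods. The sketch assumes a toy N(mu, 1) model with a N(0, tau2) prior, for which each subset posterior can be sampled exactly; in practice the draws would come from separate MCMC runs. All function names are hypothetical.

```python
import math
import random
import statistics


def subposterior_draws(subset, n_subsets, tau2, n_draws, rng):
    """Exact draws from one subset posterior of the toy N(mu, 1) model.

    The N(0, tau2) prior is raised to the power 1 / n_subsets so that the
    product of all subset posteriors is proportional to the full posterior.
    """
    precision = len(subset) + 1.0 / (n_subsets * tau2)
    mean, sd = sum(subset) / precision, math.sqrt(1.0 / precision)
    return [rng.gauss(mean, sd) for _ in range(n_draws)]


def combine_weighted(draw_sets):
    """Combine the t-th draw of every subset by inverse-variance weighting."""
    weights = [1.0 / statistics.variance(draws) for draws in draw_sets]
    total = sum(weights)
    return [sum(w * draws[t] for w, draws in zip(weights, draw_sets)) / total
            for t in range(len(draw_sets[0]))]


if __name__ == "__main__":
    rng = random.Random(1)
    data = [rng.gauss(2.0, 1.0) for _ in range(60)]
    subsets = [data[i::3] for i in range(3)]  # three disjoint subsets
    draws = [subposterior_draws(s, 3, 10.0, 5000, rng) for s in subsets]
    print(statistics.mean(combine_weighted(draws)))
```

For Gaussian subset posteriors this weighted average reproduces the full-data posterior; for non-Gaussian posteriors it is only an approximation, which is one reason several combining methods exist.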

20.
The traditional q1* methodology for constructing upper confidence limits (UCLs) for the low-dose slopes of quantal dose-response functions has two limitations: (i) it is based on an asymptotic statistical result that has been shown via Monte Carlo simulation not to hold in practice for small, real bioassay experiments (Portier and Hoel, 1983); and (ii) it assumes that the multistage model (which represents cumulative hazard as a polynomial function of dose) is correct. This paper presents an uncertainty analysis approach for fitting dose-response functions to data that does not require specific parametric assumptions or depend on asymptotic results. It has the advantage that the resulting estimates of the dose-response function (and uncertainties about it) no longer depend on the validity of an assumed parametric family nor on the accuracy of the asymptotic approximation. The method derives posterior densities for the true response rates in the dose groups, rather than deriving posterior densities for model parameters, as in other Bayesian approaches (Sielken, 1991), or resampling the observed data points, as in the bootstrap and other resampling methods. It does so by conditioning constrained maximum-entropy priors on the observed data. Monte Carlo sampling of the posterior (constrained, conditioned) probability distributions generates values of response probabilities that might be observed if the experiment were repeated with very large sample sizes. A dose-response curve is fit to each such simulated dataset. If no parametric model has been specified, then a generalized representation (e.g., a power-series or orthonormal polynomial expansion) of the unknown dose-response function is fit to each simulated dataset using “model-free” methods. The simulation-based frequency distribution of all the dose-response curves fit to the simulated datasets yields a posterior distribution function for the low-dose slope of the dose-response curve. An upper confidence limit on the low-dose slope is obtained directly from this posterior distribution. This “Data Cube” procedure is illustrated with a real dataset for benzene, and is seen to produce more policy-relevant insights than does the traditional q1* methodology. For example, it shows how far apart the 90%, 95%, and 99% limits are and reveals how uncertainty about total and incremental risk varies with dose level (typically being dominated at low doses by uncertainty about the response of the control group, and being dominated at high doses by sampling variability). Strengths and limitations of the Data Cube approach are summarized, and potential decision-analytic applications to making better informed risk management decisions are briefly discussed.
