1.

Effects of branch length uncertainty on Bayesian posterior probabilities for phylogenetic hypotheses (total citations: 1; self-citations: 0; citations by others: 1)

Kolaczkowski B Thornton JW 《Molecular biology and evolution》2007,24(9):2108-2118

In Bayesian phylogenetics, confidence in evolutionary relationships is expressed as posterior probability--the probability that a tree or clade is true given the data, evolutionary model, and prior assumptions about model parameters. Model parameters, such as branch lengths, are never known in advance; Bayesian methods incorporate this uncertainty by integrating over a range of plausible values given an assumed prior probability distribution for each parameter. Little is known about the effects of integrating over branch length uncertainty on posterior probabilities when different priors are assumed. Here, we show that integrating over uncertainty using a wide range of typical prior assumptions strongly affects posterior probabilities, causing them to deviate from those that would be inferred if branch lengths were known in advance; only when there is no uncertainty to integrate over does the average posterior probability of a group of trees accurately predict the proportion of correct trees in the group. The pattern of branch lengths on the true tree determines whether integrating over uncertainty pushes posterior probabilities upward or downward. The magnitude of the effect depends on the specific prior distributions used and the length of the sequences analyzed. Under realistic conditions, however, even extraordinarily long sequences are not enough to prevent frequent inference of incorrect clades with strong support. We found that across a range of conditions, diffuse priors--either flat or exponential distributions with moderate to large means--provide more reliable inferences than small-mean exponential priors. An empirical Bayes approach that fixes branch lengths at their maximum likelihood estimates yields posterior probabilities that more closely match those that would be inferred if the true branch lengths were known in advance and reduces the rate of strongly supported false inferences compared with fully Bayesian integration.

2.

Model parameterization, prior distributions, and the general time-reversible model in Bayesian phylogenetics

Zwickl D Holder M 《Systematic biology》2004,53(6):877-888

Bayesian phylogenetic methods require the selection of prior probability distributions for all parameters of the model of evolution. These distributions allow one to incorporate prior information into a Bayesian analysis, but even in the absence of meaningful prior information, a prior distribution must be chosen. In such situations, researchers typically seek to choose a prior that will have little effect on the posterior estimates produced by an analysis, allowing the data to dominate. Sometimes a prior that is uniform (assigning equal prior probability density to all points within some range) is chosen for this purpose. In reality, the appropriate prior depends on the parameterization chosen for the model of evolution, a choice that is largely arbitrary. There is an extensive Bayesian literature on appropriate prior choice, and it has long been appreciated that there are parameterizations for which uniform priors can have a strong influence on posterior estimates. We here discuss the relationship between model parameterization and prior specification, using the general time-reversible model of nucleotide evolution as an example. We present Bayesian analyses of 10 simulated data sets obtained using a variety of prior distributions and parameterizations of the general time-reversible model. Uniform priors can produce biased parameter estimates under realistic conditions, and a variety of alternative priors avoid this bias.

3.

Frequentist properties of Bayesian posterior probabilities of phylogenetic trees under simple and complex substitution models (total citations: 1; self-citations: 0; citations by others: 1)

Huelsenbeck J Rannala B 《Systematic biology》2004,53(6):904-913

What does the posterior probability of a phylogenetic tree mean? This simulation study shows that Bayesian posterior probabilities have the meaning that is typically ascribed to them; the posterior probability of a tree is the probability that the tree is correct, assuming that the model is correct. At the same time, the Bayesian method can be sensitive to model misspecification, and the sensitivity of the Bayesian method appears to be greater than the sensitivity of the nonparametric bootstrap method (using maximum likelihood to estimate trees). Although the estimates of phylogeny obtained by use of the method of maximum likelihood or the Bayesian method are likely to be similar, the assessment of the uncertainty of inferred trees via either bootstrapping (for maximum likelihood estimates) or posterior probabilities (for Bayesian estimates) is not likely to be the same. We suggest that the Bayesian method be implemented with the most complex models of those currently available, as this should reduce the chance that the method will concentrate too much probability on too few trees.
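
The calibration claim in this abstract (a tree with posterior probability p should be correct about a fraction p of the time) can be checked on simulation output by binning results. The sketch below is only an illustration and is not taken from the paper: the function name `calibration_table`, the bin count, and the example records are all hypothetical.

```python
def calibration_table(records, n_bins=5):
    """Compare mean posterior probability with the fraction of correct trees.

    records: iterable of (posterior_probability, was_correct) pairs,
             e.g. one pair per simulated data set (hypothetical input format).
    Returns a list of (mean_posterior, fraction_correct, count) per non-empty bin.
    """
    bins = [[] for _ in range(n_bins)]
    for p, ok in records:
        # Map p in [0, 1] to a bin index; p == 1.0 goes into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, ok))

    table = []
    for b in bins:
        if not b:
            continue
        mean_p = sum(p for p, _ in b) / len(b)
        frac_correct = sum(1 for _, ok in b if ok) / len(b)
        table.append((mean_p, frac_correct, len(b)))
    return table


# Made-up records, for illustration only.
records = [(0.95, True), (0.90, True), (0.92, False),
           (0.55, True), (0.50, False), (0.10, False)]
table = calibration_table(records)
for mean_p, frac_correct, count in table:
    print(f"mean posterior {mean_p:.2f}  fraction correct {frac_correct:.2f}  n={count}")
```

If posterior probabilities are well calibrated, the two columns should roughly agree within each bin once enough simulated data sets are available.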

4.

Elevated substitution rate estimates from ancient DNA: model violation and bias of Bayesian methods

Miguel Navascués Brent C. Emerson 《Molecular ecology》2009,18(21):4390-4397

The increasing ability to extract and sequence DNA from noncontemporaneous tissue offers biologists the opportunity to analyse ancient DNA (aDNA) together with modern DNA (mDNA) to address the taxonomy of extinct species, evolutionary origins, historical phylogeography and biogeography. Perhaps more exciting are recent developments in coalescence-based Bayesian inference that offer the potential to use temporal information from aDNA and mDNA for the estimation of substitution rates and divergence dates as an alternative to fossil and geological calibration. This comes at a time of growing interest in the possibility of time dependency for molecular rate estimates. In this study, we provide a critical assessment of Bayesian Markov chain Monte Carlo (MCMC) analysis for the estimation of substitution rate using simulated samples of aDNA and mDNA. We conclude that the current models and priors employed in Bayesian MCMC analysis of heterochronous mtDNA are susceptible to an upward bias in the estimation of substitution rates because of model misspecification when the data come from populations with less than simple demographic histories, including sudden short-lived population bottlenecks or pronounced population structure. However, when model misspecification is only mild, then the 95% highest posterior density intervals provide adequate frequentist coverage of the true rates.

5.

An examination of the monophyly of morning glory taxa using Bayesian phylogenetic inference

Miller RE Buckley TR Manos PS 《Systematic biology》2002,51(5):740-753

The objective of this study was to obtain a quantitative assessment of the monophyly of morning glory taxa, specifically the genus Ipomoea and the tribe Argyreieae. Previous systematic studies of morning glories intimated the paraphyly of Ipomoea by suggesting that the genera within the tribe Argyreieae are derived from within Ipomoea; however, no quantitative estimates of statistical support were developed to address these questions. We applied a Bayesian analysis to provide quantitative estimates of monophyly in an investigation of morning glory relationships using DNA sequence data. We also explored various approaches for examining convergence of the Markov chain Monte Carlo (MCMC) simulation of the Bayesian analysis by running 18 separate analyses varying in length. We found convergence of the important components of the phylogenetic model (the tree with the maximum posterior probability, branch lengths, the parameter values from the DNA substitution model, and the posterior probabilities for clade support) for these data after one million generations of the MCMC simulations. In the process, we identified a run where the parameter values obtained were often outside the range of values obtained from the other runs, suggesting an aberrant result. In addition, we compared the Bayesian method of phylogenetic analysis to maximum likelihood and maximum parsimony. The results from the Bayesian analysis and the maximum likelihood analysis were similar for topology, branch lengths, and parameters of the DNA substitution model. Topologies also were similar in the comparison between the Bayesian analysis and maximum parsimony, although the posterior probabilities and the bootstrap proportions exhibited some striking differences. 
In a Bayesian analysis of three data sets (ITS sequences, waxy sequences, and ITS + waxy sequences) no support for the monophyly of the genus Ipomoea, or for the tribe Argyreieae, was observed, with the estimate of the probability of the monophyly of these taxa being less than 3.4 × 10^(-7).

6.

A Bayesian multinomial logistic exposure model for estimating probabilities of competing sources of nest failure

Abigail J. Darrah Jonathan B. Cohen Paul M. Castelli 《Ibis》2018,160(1):23-35

Understanding causes of nest loss is critical for the management of endangered bird populations. Available methods for estimating nest loss probabilities to competing sources do not allow for random effects and covariation among sources, and there are few data simulation methods or goodness‐of‐fit (GOF) tests for such models. We developed a Bayesian multinomial extension of the widely used logistic exposure (LE) nest survival model which can incorporate multiple random effects and fixed‐effect covariates for each nest loss category. We investigated the performance of this model and the accompanying GOF test by analysing simulated nest fate datasets with and without age‐biased discovery probability, and by comparing the estimates with those of traditional fixed‐effects estimators. We then exemplify the use of the multinomial LE model and GOF test by analysing Piping Plover Charadrius melodus nest fate data (n = 443) to explore the effects of wire cages (exclosures) constructed around nests, which are used to protect nests from predation but can lead to increased nest abandonment rates. Mean parameter estimates of the random‐effects multinomial LE model were all within 1 sd of the true values used to simulate the datasets. Age‐biased discovery probability did not result in biased parameter estimates. Traditional fixed‐effects models provided estimates with a high bias of up to 43% with a mean of 71% smaller standard deviations. The GOF test identified models that were a poor fit to the simulated data. For the Piping Plover dataset, the fixed‐effects model was less well‐supported than the random‐effects model and underestimated the risk of exclosure use by 16%. The random‐effects model estimated a range of 1–6% probability of abandonment for nests not protected by exclosures across sites and 5–41% probability of abandonment for nests with exclosures, suggesting that the magnitude of exclosure‐related abandonment is site‐specific. 
Our results demonstrate that unmodelled heterogeneity can result in biased estimates potentially leading to incorrect management recommendations. The Bayesian multinomial LE model offers a flexible method of incorporating random effects into an analysis of nest failure and is robust to age‐biased nest discovery probability. This model can be generalized to other staggered‐entry, time‐to‐hazard situations.

7.

Long-Branch Attraction Bias and Inconsistency in Bayesian Phylogenetics

Bryan Kolaczkowski Joseph W. Thornton 《PloS one》2009,4(12)

Bayesian inference (BI) of phylogenetic relationships uses the same probabilistic models of evolution as its precursor maximum likelihood (ML), so BI has generally been assumed to share ML's desirable statistical properties, such as largely unbiased inference of topology given an accurate model and increasingly reliable inferences as the amount of data increases. Here we show that BI, unlike ML, is biased in favor of topologies that group long branches together, even when the true model and prior distributions of evolutionary parameters over a group of phylogenies are known. Using experimental simulation studies and numerical and mathematical analyses, we show that this bias becomes more severe as more data are analyzed, causing BI to infer an incorrect tree as the maximum a posteriori phylogeny with asymptotically high support as sequence length approaches infinity. BI's long branch attraction bias is relatively weak when the true model is simple but becomes pronounced when sequence sites evolve heterogeneously, even when this complexity is incorporated in the model. This bias, which is apparent under both controlled simulation conditions and in analyses of empirical sequence data, also makes BI less efficient and less robust to the use of an incorrect evolutionary model than ML. Surprisingly, BI's bias is caused by one of the method's stated advantages: that it incorporates uncertainty about branch lengths by integrating over a distribution of possible values instead of estimating them from the data, as ML does. Our findings suggest that trees inferred using BI should be interpreted with caution and that ML may be a more reliable framework for modern phylogenetic analysis.

8.

Branch-length prior influences Bayesian posterior probability of phylogeny

Yang Z Rannala B 《Systematic biology》2005,54(3):455-470

The Bayesian method for estimating species phylogenies from molecular sequence data provides an attractive alternative to maximum likelihood with nonparametric bootstrap due to the easy interpretation of posterior probabilities for trees and to availability of efficient computational algorithms. However, for many data sets it produces extremely high posterior probabilities, sometimes for apparently incorrect clades. Here we use both computer simulation and empirical data analysis to examine the effect of the prior model for internal branch lengths. We found that posterior probabilities for trees and clades are sensitive to the prior for internal branch lengths, and priors assuming long internal branches cause high posterior probabilities for trees. In particular, uniform priors with high upper bounds bias Bayesian clade probabilities in favor of extreme values. We discuss possible remedies to the problem, including empirical and full Bayesian methods and subjective procedures suggested in Bayesian hypothesis testing. Our results also suggest that the bootstrap proportion and Bayesian posterior probability are different measures of accuracy, and that the bootstrap proportion, if interpreted as the probability that the clade is true, can be either too liberal or too conservative.

9.

Open capture-recapture models with heterogeneity: I. Cormack-Jolly-Seber model

Pledger S Pollock KH Norris JL 《Biometrics》2003,59(4):786-794

In open population capture-recapture studies, it is usually assumed that similar animals (e.g., of the same sex and age group) have similar survival rates and capture probabilities. These assumptions are generally perceived to be an oversimplification, and they can lead to incorrect model selection and biased parameter estimates. Allowing for individual variability in survival and capture probabilities among apparently similar animals is now becoming possible, due to advances in closed population models and improved computing power. This article presents a flexible framework of likelihood-based models which allow for individual heterogeneity in survival and capture rates. Heterogeneity is modeled using finite mixtures, which have enough flexibility of distribution shape to accommodate a wide variety of different patterns of individual variation. The models condition on the first capture of each animal, and include as a special case the Cormack-Jolly-Seber model. Model selection is done either using Akaike's information criterion or by likelihood ratio tests, making available checks of different influences on survival rates. Bias in parameter estimates is reduced by including individual heterogeneity. Model selection and bias reduction are important in population studies and for making informed management decisions.

10.

Empirical Bayes interval estimates that are conditionally equal to unadjusted confidence intervals or to default prior credibility intervals

Bickel DR 《Statistical applications in genetics and molecular biology》2012,11(3):Article 7

Problems involving thousands of null hypotheses have been addressed by estimating the local false discovery rate (LFDR). A previous LFDR approach to reporting point and interval estimates of an effect-size parameter uses an estimate of the prior distribution of the parameter conditional on the alternative hypothesis. That estimated prior is often unreliable, and yet strongly influences the posterior intervals and point estimates, causing the posterior intervals to differ from fixed-parameter confidence intervals, even for arbitrarily small estimates of the LFDR. That influence of the estimated prior manifests the failure of the conditional posterior intervals, given the truth of the alternative hypothesis, to match the confidence intervals. Those problems are overcome by changing the posterior distribution conditional on the alternative hypothesis from a Bayesian posterior to a confidence posterior. Unlike the Bayesian posterior, the confidence posterior equates the posterior probability that the parameter lies in a fixed interval with the coverage rate of the coinciding confidence interval. The resulting confidence-Bayes hybrid posterior supplies interval and point estimates that shrink toward the null hypothesis value. The confidence intervals tend to be much shorter than their fixed-parameter counterparts, as illustrated with gene expression data. Simulations nonetheless confirm that the shrunken confidence intervals cover the parameter more frequently than stated. Generally applicable sufficient conditions for correct coverage are given. In addition to having those frequentist properties, the hybrid posterior can also be motivated from an objective Bayesian perspective by requiring coherence with some default prior conditional on the alternative hypothesis. That requirement generates a new class of approximate posteriors that supplement Bayes factors modified for improper priors and that dampen the influence of proper priors on the credibility intervals. 
While that class of posteriors intersects the class of confidence-Bayes posteriors, neither class is a subset of the other. In short, two first principles generate both classes of posteriors: a coherence principle and a relevance principle. The coherence principle requires that all effect size estimates comply with the same probability distribution. The relevance principle means effect size estimates given the truth of an alternative hypothesis cannot depend on whether that truth was known prior to observing the data or whether it was learned from the data.

11.

Population genetics of polymorphism and divergence for diploid selection models with arbitrary dominance

Williamson S Fledel-Alon A Bustamante CD 《Genetics》2004,168(1):463-475

We develop a Poisson random-field model of polymorphism and divergence that allows arbitrary dominance relations in a diploid context. This model provides a maximum-likelihood framework for estimating both selection and dominance parameters of new mutations using information on the frequency spectrum of sequence polymorphisms. This is the first DNA sequence-based estimator of the dominance parameter. Our model also leads to a likelihood-ratio test for distinguishing nongenic from genic selection; simulations indicate that this test is quite powerful when a large number of segregating sites are available. We also use simulations to explore the bias in selection parameter estimates caused by unacknowledged dominance relations. When inference is based on the frequency spectrum of polymorphisms, genic selection estimates of the selection parameter can be very strongly biased even for minor deviations from the genic selection model. Surprisingly, however, when inference is based on polymorphism and divergence (McDonald-Kreitman) data, genic selection estimates of the selection parameter are nearly unbiased, even for completely dominant or recessive mutations. Further, we find that weak overdominant selection can increase, rather than decrease, the substitution rate relative to levels of polymorphism. This nonintuitive result has major implications for the interpretation of several popular tests of neutrality.

12.

Robustness of compound Dirichlet priors for Bayesian inference of branch lengths

Zhang C Rannala B Yang Z 《Systematic biology》2012,61(5):779-784

We modified the phylogenetic program MrBayes 3.1.2 to incorporate the compound Dirichlet priors for branch lengths proposed recently by Rannala, Zhu, and Yang (2012. Tail paradox, partial identifiability and influential priors in Bayesian branch length inference. Mol. Biol. Evol. 29:325-335.) as a solution to the problem of branch-length overestimation in Bayesian phylogenetic inference. The compound Dirichlet prior specifies a fairly diffuse prior on the tree length (the sum of branch lengths) and uses a Dirichlet distribution to partition the tree length into branch lengths. Six problematic data sets originally analyzed by Brown, Hedtke, Lemmon, and Lemmon (2010. When trees grow too long: investigating the causes of highly inaccurate Bayesian branch-length estimates. Syst. Biol. 59:145-161) are reanalyzed using the modified version of MrBayes to investigate properties of Bayesian branch-length estimation using the new priors. While the default exponential priors for branch lengths produced extremely long trees, the compound Dirichlet priors produced posterior estimates that are much closer to the maximum likelihood estimates. Furthermore, the posterior tree lengths were quite robust to changes in the parameter values in the compound Dirichlet priors, for example, when the prior mean of tree length changed over several orders of magnitude. Our results suggest that the compound Dirichlet priors may be useful for correcting branch-length overestimation in phylogenetic analyses of empirical data sets.
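
The two-stage construction described in this abstract (a prior on the tree length, then a Dirichlet split of that length across branches) can be sketched as a sampler. The code below is an illustrative assumption, not the MrBayes implementation: it assumes a gamma prior on tree length and a symmetric Dirichlet on branch proportions, and the function name and parameter values are hypothetical.

```python
import random


def sample_compound_dirichlet(n_branches, shape, rate, conc, rng):
    """Draw branch lengths from a compound-Dirichlet-style prior.

    Assumptions (not taken from the paper's code):
      - tree length T ~ Gamma(shape, rate)
      - branch proportions ~ symmetric Dirichlet(conc, ..., conc)
    Returns (tree_length, branch_lengths).
    """
    # random.gammavariate takes (shape, scale), so scale = 1 / rate.
    tree_length = rng.gammavariate(shape, 1.0 / rate)

    # A Dirichlet draw is a set of independent gamma draws, normalized to sum to 1.
    gammas = [rng.gammavariate(conc, 1.0) for _ in range(n_branches)]
    total = sum(gammas)
    proportions = [g / total for g in gammas]

    branch_lengths = [tree_length * p for p in proportions]
    return tree_length, branch_lengths


rng = random.Random(0)
# Hypothetical settings: 7 branches, prior mean tree length shape/rate = 1.0.
tree_length, branch_lengths = sample_compound_dirichlet(7, 1.0, 1.0, 1.0, rng)
print(tree_length, branch_lengths)
```

The point of the construction is that the prior on the total tree length is set directly, rather than being implied by the number of branches as it is when each branch gets an independent exponential prior.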

13.

Uncertainties on lung doses from inhaled plutonium

Puncher M Birchall A Bull RK 《Radiation research》2011,176(4):494-507

In a recent epidemiological study, Bayesian uncertainties on lung doses have been calculated to determine lung cancer risk from occupational exposures to plutonium. These calculations used a revised version of the Human Respiratory Tract Model (HRTM) published by the ICRP. In addition to the Bayesian analyses, which give probability distributions of doses, point estimates of doses (single estimates without uncertainty) were also provided for that study using the existing HRTM as it is described in ICRP Publication 66; these are to be used in a preliminary analysis of risk. To infer the differences between the point estimates and Bayesian uncertainty analyses, this paper applies the methodology to former workers of the United Kingdom Atomic Energy Authority (UKAEA), who constituted a subset of the study cohort. The resulting probability distributions of lung doses are compared with the point estimates obtained for each worker. It is shown that mean posterior lung doses are around two- to fourfold higher than point estimates and that uncertainties on doses vary over a wide range, greater than two orders of magnitude for some lung tissues. In addition, we demonstrate that uncertainties on the parameter values, rather than the model structure, are largely responsible for these effects. Of these it appears to be the parameters describing absorption from the lungs to blood that have the greatest impact on estimates of lung doses from urine bioassay. Therefore, accurate determination of the chemical form of inhaled plutonium and the absorption parameter values for these materials is important for obtaining reliable estimates of lung doses and hence risk from occupational exposures to plutonium.

14.

Phylogenetic analysis using parsimony and likelihood methods (total citations: 1; self-citations: 0; citations by others: 1)

Ziheng Yang 《Journal of molecular evolution》1996,42(2):294-307

The assumptions underlying the maximum-parsimony (MP) method of phylogenetic tree reconstruction were intuitively examined by studying the way the method works. Computer simulations were performed to corroborate the intuitive examination. Parsimony appears to involve very stringent assumptions concerning the process of sequence evolution, such as constancy of substitution rates between nucleotides, constancy of rates across nucleotide sites, and equal branch lengths in the tree. For practical data analysis, the requirement of equal branch lengths means similar substitution rates among lineages (the existence of an approximate molecular clock), relatively long interior branches, and also few species in the data. However, a small amount of evolution is neither a necessary nor a sufficient requirement of the method. The difficulties involved in the application of current statistical estimation theory to tree reconstruction were discussed, and it was suggested that the approach proposed by Felsenstein (1981,J. Mol. Evol. 17: 368–376) for topology estimation, as well as its many variations and extensions, differs fundamentally from the maximum likelihood estimation of a conventional statistical parameter. Evidence was presented showing that the Felsenstein approach does not share the asymptotic efficiency of the maximum likelihood estimator of a statistical parameter. Computer simulations were performed to study the probability that MP recovers the true tree under a hierarchy of models of nucleotide substitution; its performance relative to the likelihood method was especially noted. The results appeared to support the intuitive examination of the assumptions underlying MP. When a simple model of nucleotide substitution was assumed to generate data, the probability that MP recovers the true topology could be as high as, or even higher than, that for the likelihood method. 
When the assumed model became more complex and realistic, e.g., when substitution rates were allowed to differ between nucleotides or across sites, the probability that MP recovers the true topology, and especially its performance relative to that of the likelihood method, generally deteriorated. As the complexity of the process of nucleotide substitution in real sequences is well recognized, the likelihood method appears preferable to parsimony. However, the development of a statistical methodology for the efficient estimation of the tree topology remains a difficult open problem.

15.

Tail paradox, partial identifiability, and influential priors in Bayesian branch length inference

Rannala B Zhu T Yang Z 《Molecular biology and evolution》2012,29(1):325-335

Recent studies have observed that Bayesian analyses of sequence data sets using the program MrBayes sometimes generate extremely large branch lengths, with posterior credibility intervals for the tree length (sum of branch lengths) excluding the maximum likelihood estimates. Suggested explanations for this phenomenon include the existence of multiple local peaks in the posterior, lack of convergence of the chain in the tail of the posterior, mixing problems, and misspecified priors on branch lengths. Here, we analyze the behavior of Bayesian Markov chain Monte Carlo algorithms when the chain is in the tail of the posterior distribution and note that all these phenomena can occur. In Bayesian phylogenetics, the likelihood function approaches a constant instead of zero when the branch lengths increase to infinity. The flat tail of the likelihood can cause poor mixing and undue influence of the prior. We suggest that the main cause of the extreme branch length estimates produced in many Bayesian analyses is the poor choice of a default prior on branch lengths in current Bayesian phylogenetic programs. The default prior in MrBayes assigns independent and identical distributions to branch lengths, imposing strong (and unreasonable) assumptions about the tree length. The problem is exacerbated by the strong correlation between the branch lengths and parameters in models of variable rates among sites or among site partitions. To resolve the problem, we suggest two multivariate priors for the branch lengths (called compound Dirichlet priors) that are fairly diffuse and demonstrate their utility in the special case of branch length estimation on a star phylogeny. Our analysis highlights the need for careful thought in the specification of high-dimensional priors in Bayesian analyses.

16.

Using Inverse Probability Bootstrap Sampling to Eliminate Sample Induced Bias in Model Based Analysis of Unequal Probability Samples

Matthew Nahorniak David P. Larsen Carol Volk Chris E. Jordan 《PloS one》2015,10(6)

In ecology, as in other research fields, efficient sampling for population estimation often drives sample designs toward unequal probability sampling, such as in stratified sampling. Design based statistical analysis tools are appropriate for seamless integration of sample design into the statistical analysis. However, it is also common and necessary, after a sampling design has been implemented, to use datasets to address questions that, in many cases, were not considered during the sampling design phase. Questions may arise requiring the use of model based statistical tools such as multiple regression, quantile regression, or regression tree analysis. However, such model based tools may require, for ensuring unbiased estimation, data from simple random samples, which can be problematic when analyzing data from unequal probability designs. Despite numerous method specific tools available to properly account for sampling design, too often in the analysis of ecological data, sample design is ignored and consequences are not properly considered. We demonstrate here that violation of this assumption can lead to biased parameter estimates in ecological research. In addition to the set of tools available for researchers to properly account for sampling design in model based analysis, we introduce inverse probability bootstrapping (IPB). Inverse probability bootstrapping is an easily implemented method for obtaining equal probability re-samples from a probability sample, from which unbiased model based estimates can be made. We demonstrate the potential for bias in model-based analyses that ignore sample inclusion probabilities, and the effectiveness of IPB sampling in eliminating this bias, using both simulated and actual ecological data. For illustration, we considered three model based analysis tools: linear regression, quantile regression, and boosted regression tree analysis. 
In all models, using both simulated and actual ecological data, we found inferences to be biased, sometimes severely, when sample inclusion probabilities were ignored, while IPB sampling effectively produced unbiased parameter estimates.
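
One reading of the resampling idea in this abstract is to redraw units from the sample with weights proportional to the inverse of their inclusion probabilities, so that the resample behaves like an equal-probability sample. The sketch below follows that reading only; the details of the published procedure may differ, and the function name and the two-stratum example are hypothetical.

```python
import random


def ipb_resample(values, inclusion_probs, size, rng):
    """Resample with replacement, weighting each unit by 1 / inclusion probability.

    Units that were unlikely to enter the original sample are drawn more often,
    which offsets the unequal-probability design.
    """
    weights = [1.0 / p for p in inclusion_probs]
    return rng.choices(values, weights=weights, k=size)


# Hypothetical stratified sample:
#   common stratum (value 0), 900 population units, sampled at 10% -> 90 units
#   rare stratum (value 10), 100 population units, sampled at 90% -> 90 units
# The true population mean is (900 * 0 + 100 * 10) / 1000 = 1.0.
values = [0.0] * 90 + [10.0] * 90
inclusion_probs = [0.1] * 90 + [0.9] * 90

# Ignoring the design over-represents the rare stratum.
naive_mean = sum(values) / len(values)

rng = random.Random(1)
resample = ipb_resample(values, inclusion_probs, 20000, rng)
ipb_mean = sum(resample) / len(resample)

print("naive mean:", naive_mean)
print("IPB resample mean:", ipb_mean)
```

A model-based tool such as a regression would then be fitted to each resample, and the fits summarized across resamples.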

17.

Bayesian inference in ecology (total citations: 14; self-citations: 1; citations by others: 13)

Aaron M. Ellison 《Ecology letters》2004,7(6):509-520

Bayesian inference is an important statistical tool that is increasingly being used by ecologists. In a Bayesian analysis, information available before a study is conducted is summarized in a quantitative model or hypothesis: the prior probability distribution. Bayes’ Theorem uses the prior probability distribution and the likelihood of the data to generate a posterior probability distribution. Posterior probability distributions are an epistemological alternative to P‐values and provide a direct measure of the degree of belief that can be placed on models, hypotheses, or parameter estimates. Moreover, Bayesian information‐theoretic methods provide robust measures of the probability of alternative models, and multiple models can be averaged into a single model that reflects uncertainty in model construction and selection. These methods are demonstrated through a simple worked example. Ecologists are using Bayesian inference in studies that range from predicting single‐species population dynamics to understanding ecosystem processes. Not all ecologists, however, appreciate the philosophical underpinnings of Bayesian inference. In particular, Bayesians and frequentists differ in their definition of probability and in their treatment of model parameters as random variables or estimates of true values. These assumptions must be addressed explicitly before deciding whether or not to use Bayesian methods to analyse ecological data.
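
The prior-times-likelihood step described in this abstract can be shown on a small discrete example. This is not the worked example from the paper: the candidate values, the flat prior, and the survey data below are hypothetical and chosen only to make the arithmetic visible.

```python
from math import comb


def posterior(prior, likelihood):
    """Bayes' theorem on a discrete set of hypotheses.

    posterior_i = prior_i * likelihood_i / sum_j(prior_j * likelihood_j)
    """
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(unnormalized)
    return [u / evidence for u in unnormalized]


# Hypothetical question: what is the detection probability at a site?
thetas = [0.2, 0.5, 0.8]          # candidate values (assumed)
prior = [1 / 3, 1 / 3, 1 / 3]     # flat prior over the candidates (assumed)

# Hypothetical data: 7 detections in 10 surveys, binomial likelihood.
n, k = 10, 7
likelihood = [comb(n, k) * t**k * (1 - t) ** (n - k) for t in thetas]

post = posterior(prior, likelihood)
for t, p in zip(thetas, post):
    print(f"theta = {t}: posterior probability {p:.3f}")
```

With a flat prior the posterior simply follows the likelihood; replacing `prior` with an informative set of weights shows how prior information shifts the result.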

18.

Comparative performance of Bayesian and AIC-based measures of phylogenetic model uncertainty

Alfaro ME Huelsenbeck JP 《Systematic biology》2006,55(1):89-96

Reversible-jump Markov chain Monte Carlo (RJ-MCMC) is a technique for simultaneously evaluating multiple related (but not necessarily nested) statistical models that has recently been applied to the problem of phylogenetic model selection. Here we use a simulation approach to assess the performance of this method and compare it to Akaike weights, a measure of model uncertainty that is based on the Akaike information criterion. Under conditions where the assumptions of the candidate models matched the generating conditions, both Bayesian and AIC-based methods perform well. The 95% credible interval contained the generating model close to 95% of the time. However, the size of the credible interval differed, with the Bayesian credible set containing approximately 25% to 50% fewer models than an AIC-based credible interval. The posterior probability was a better indicator of the correct model than the Akaike weight when all assumptions were met, but both measures performed similarly when some model assumptions were violated. Models in the Bayesian posterior distribution were also more similar to the generating model in their number of parameters and were less biased in their complexity. In contrast, Akaike-weighted models were more distant from the generating model and biased towards slightly greater complexity. The AIC-based credible interval appeared to be more robust to the violation of the rate homogeneity assumption. Both AIC and Bayesian approaches suggest that substantial uncertainty can accompany the choice of model for phylogenetic analyses, suggesting that alternative candidate models should be examined in analysis of phylogenetic data. [AIC; Akaike weights; Bayesian phylogenetics; model averaging; model selection; model uncertainty; posterior probability; reversible jump.]
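
For readers unfamiliar with Akaike weights, the sketch below uses the standard definition, w_i = exp(-Δ_i / 2) / Σ_j exp(-Δ_j / 2) with Δ_i = AIC_i - AIC_min, and then builds a 95% credible set by accumulating models in order of decreasing weight. The function names and the AIC values are hypothetical, and the set construction is an assumption about how such a set might be formed, not the authors' code.

```python
import math


def akaike_weights(aic_values):
    """Convert AIC scores into Akaike weights that sum to 1."""
    best = min(aic_values)
    rel_likelihood = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel_likelihood)
    return [r / total for r in rel_likelihood]


def credible_set(weights, level=0.95):
    """Indices of the fewest top-weighted models whose weights reach `level`."""
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    chosen, cumulative = [], 0.0
    for i in order:
        chosen.append(i)
        cumulative += weights[i]
        if cumulative >= level:
            break
    return chosen


if __name__ == "__main__":
    aic = [1002.3, 1000.0, 1004.9, 1010.2]  # made-up scores for four models
    w = akaike_weights(aic)
    print([round(x, 3) for x in w])
    print(credible_set(w))
```

The same `credible_set` function could be applied to posterior model probabilities estimated by RJ-MCMC, which makes the sizes of the two kinds of set directly comparable.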

19.

An exploration of fixed and random effects selection for longitudinal binary outcomes in the presence of nonignorable dropout

Ning Li Michael J. Daniels Gang Li Robert M. Elashoff 《Biometrical journal. Biometrische Zeitschrift》2013,55(1):17-37

We explore a Bayesian approach to selection of variables that represent fixed and random effects in modeling of longitudinal binary outcomes with missing data caused by dropouts. We show via analytic results for a simple example that nonignorable missing data lead to biased parameter estimates. This bias results in selection of wrong effects asymptotically, which we can confirm via simulations for more complex settings. By jointly modeling the longitudinal binary data with the dropout process that possibly leads to nonignorable missing data, we are able to correct the bias in estimation and selection. Mixture priors with a point mass at zero are used to facilitate variable selection. We illustrate the proposed approach using a clinical trial for acute ischemic stroke.

20.

A Bayesian approach to the multiplicity problem for significance testing with binomial data (Total citations: 1; self-citations: 0; citations by others: 1)

C Y Meng A P Dempster 《Biometrics》1987,43(2):301-311

Statistical analyses of simple tumor rates from an animal experiment with one control and one treated group typically consist of hypothesis testing of many 2 × 2 tables, one for each tumor type or site. The multiplicity of significance tests may cause excessive overall false-positive rates. This paper presents a Bayesian approach to the problem of multiple significance testing. We develop a normal logistic model that accommodates the incidences of all tumor types or sites observed in the current experiment simultaneously as well as their historical control incidences. Exchangeable normal priors are assumed for certain linear terms in the model. Posterior means, standard deviations, and Bayesian P-values are computed for an average treatment effect as well as for the effects on individual tumor types or sites. Model assumptions are checked using probability plots and the sensitivity of the parameter estimates to alternative priors is studied. The method is illustrated using tumor data from a chronic animal experiment.
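
To make the multiplicity problem concrete, the sketch below computes the chance of at least one false positive when m tests are each run at level α, using 1 - (1 - α)^m. It assumes the tests are independent, which tumor sites within the same animals generally are not, so the numbers are only indicative. The function name is hypothetical and this is not the Bayesian model proposed in the paper.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive among independent tests."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if n_tests < 0:
        raise ValueError("n_tests must be non-negative")
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    # One 2 x 2 table per tumor type or site, each tested at the 5% level.
    for m in (1, 5, 20, 50):
        print(m, round(familywise_error_rate(0.05, m), 3))
```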