Similar Articles
20 similar articles found (search time: 46 ms)
1.
During the 20th century ecologists largely relied on the frequentist system of inference for the analysis of their data. However, in the past few decades ecologists have become increasingly interested in the use of Bayesian methods of data analysis. In this article I provide guidance to ecologists who would like to decide whether Bayesian methods can be used to improve their conclusions and predictions. I begin by providing a concise summary of Bayesian methods of analysis, including a comparison of differences between Bayesian and frequentist approaches to inference when using hierarchical models. Next I provide a list of problems where Bayesian methods of analysis may arguably be preferred over frequentist methods. These problems are usually encountered in analyses based on hierarchical models of data. I describe the essentials required for applying modern methods of Bayesian computation, and I use real-world examples to illustrate these methods. I conclude by summarizing what I perceive to be the main strengths and weaknesses of using Bayesian methods to solve ecological inference problems.
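The basic contrast between frequentist and Bayesian estimation summarized above can be sketched with a minimal conjugate example (all numbers hypothetical):

```python
# Minimal conjugate sketch (all numbers hypothetical): estimating a detection
# probability from k successes in n trials.
n, k = 10, 7

# Frequentist point estimate: the maximum-likelihood estimate.
p_mle = k / n

# Bayesian conjugate update: a Beta(a, b) prior combined with a binomial
# likelihood yields a Beta(a + k, b + n - k) posterior.
a_prior, b_prior = 1.0, 1.0               # uniform Beta(1, 1) prior
a_post, b_post = a_prior + k, b_prior + (n - k)

# The posterior mean shrinks the MLE toward the prior mean of 0.5.
p_post_mean = a_post / (a_post + b_post)  # (1 + 7) / (2 + 10)
```

With more data the posterior mean converges to the MLE; the difference matters most for the sparse data sets common in ecology.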

2.
Establishing that a set of population-splitting events occurred at the same time can be a potentially persuasive argument that a common process affected the populations. Recently, Oaks et al. (2013) assessed the ability of an approximate-Bayesian model-choice method (msBayes) to estimate such a pattern of simultaneous divergence across taxa, to which Hickerson et al. (2014) responded. Both papers agree that the primary inference enabled by the method is very sensitive to prior assumptions and often erroneously supports shared divergences across taxa when prior uncertainty about divergence times is represented by a uniform distribution. However, the papers differ about the best explanation and solution for this problem. Oaks et al. (2013) suggested the method's behavior was caused by the strong weight of uniformly distributed priors on divergence times leading to smaller marginal likelihoods (and thus smaller posterior probabilities) of models with more divergence-time parameters (Hypothesis 1); they proposed alternative prior probability distributions to avoid such strongly weighted posteriors. Hickerson et al. (2014) suggested numerical-approximation error causes msBayes analyses to be biased toward models of clustered divergences because the method's rejection algorithm is unable to adequately sample the parameter space of richer models within reasonable computational limits when using broad uniform priors on divergence times (Hypothesis 2). As a potential solution, they proposed a model-averaging approach that uses narrow, empirically informed uniform priors. Here, we use analyses of simulated and empirical data to demonstrate that the approach of Hickerson et al. (2014) does not mitigate the method's tendency to erroneously support models of highly clustered divergences, and is dangerous in the sense that the empirically derived uniform priors often exclude from consideration the true values of the divergence-time parameters.
Our results also show that the tendency of msBayes analyses to support models of shared divergences is primarily due to Hypothesis 1, whereas Hypothesis 2 is an untenable explanation for the bias. Overall, this series of papers demonstrates that if our prior assumptions place too much weight in unlikely regions of parameter space such that the exact posterior supports the wrong model of evolutionary history, no amount of computation can rescue our inference. Fortunately, as predicted by fundamental principles of Bayesian model choice, more flexible distributions that accommodate prior uncertainty about parameters without placing excessive weight in vast regions of parameter space with low likelihood increase the method's robustness and power to detect temporal variation in divergences.
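The core of Hypothesis 1, that broad uniform priors dilute the marginal likelihood, can be illustrated with a toy calculation, assuming a single Normal(mu, 1) observation and uniform priors on mu of different widths (numbers illustrative, not from the papers):

```python
import math

# Toy numerical check: for a single observation y = 0 from a Normal(mu, 1)
# likelihood, a broader uniform prior on mu spreads prior mass into regions
# of low likelihood and so yields a smaller marginal likelihood.
def marginal_likelihood(width, steps=20000):
    total = 0.0
    for i in range(steps):
        mu = -width / 2 + width * (i + 0.5) / steps      # midpoint rule
        like = math.exp(-0.5 * mu * mu) / math.sqrt(2 * math.pi)
        total += like * (1.0 / width) * (width / steps)  # likelihood x prior x dmu
    return total

ml_narrow = marginal_likelihood(2.0)   # prior: Uniform(-1, 1)
ml_broad = marginal_likelihood(20.0)   # prior: Uniform(-10, 10)
```

A model with several such parameters, each under a broad uniform prior, pays this penalty once per parameter, which is the mechanism by which the exact posterior can come to favor models with fewer divergence-time parameters.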

3.
Controlling for imperfect detection is important for developing species distribution models (SDMs). Occupancy‐detection models based on the time needed to detect a species can be used to address this problem, but this is hindered when times to detection are not known precisely. Here, we extend the time‐to‐detection model to deal with detections recorded in time intervals and illustrate the method using a case study on stream fish distribution modeling. We collected electrofishing samples of six fish species across a Mediterranean watershed in Northeast Portugal. Based on a Bayesian hierarchical framework, we modeled the probability of water presence in stream channels, and the probability of species occupancy conditional on water presence, in relation to environmental and spatial variables. We also modeled time‐to‐first detection conditional on occupancy in relation to local factors, using modified interval‐censored exponential survival models. Posterior distributions of occupancy probabilities derived from the models were used to produce species distribution maps. Simulations indicated that the modified time‐to‐detection model provided unbiased parameter estimates despite interval‐censoring. There was a tendency for spatial variation in detection rates to be primarily influenced by depth and, to a lesser extent, stream width. Species occupancies were consistently affected by stream order, elevation, and annual precipitation. Bayesian P‐values and AUCs indicated that all models had adequate fit and high discrimination ability, respectively. Mapping of predicted occupancy probabilities showed widespread distribution by most species, but uncertainty was generally higher in tributaries and upper reaches. The interval‐censored time‐to‐detection model provides a practical solution to model occupancy‐detection when detections are recorded in time intervals. 
This modeling framework is useful for developing SDMs while controlling for variation in detection rates, as it uses simple data that can be readily collected by field ecologists.
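The interval-censoring idea can be sketched in a few lines, assuming (as a simplification of the survival model above) that times to first detection are exponential with rate lam; all parameter values here are hypothetical:

```python
import math

# Sketch of the interval-censored likelihood pieces, assuming detection times
# are exponential with rate lam at an occupied site.
def interval_prob(lam, a, b):
    """P(first detection falls in (a, b]) for an occupied site."""
    return math.exp(-lam * a) - math.exp(-lam * b)

def no_detection_prob(lam, t_max):
    """P(no detection by the end of a survey of length t_max)."""
    return math.exp(-lam * t_max)

lam = 0.5                                        # hypothetical rate per minute
p_first_interval = interval_prob(lam, 0.0, 2.0)  # detected within first 2 min
p_missed = no_detection_prob(lam, 10.0)          # occupied but never detected
```

Recording only the interval containing the first detection, rather than the exact time, changes the likelihood contribution from a density to the interval probability above; that is the modification the authors exploit.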

4.
Bayesian phylogenetic methods require the selection of prior probability distributions for all parameters of the model of evolution. These distributions allow one to incorporate prior information into a Bayesian analysis, but even in the absence of meaningful prior information, a prior distribution must be chosen. In such situations, researchers typically seek to choose a prior that will have little effect on the posterior estimates produced by an analysis, allowing the data to dominate. Sometimes a prior that is uniform (assigning equal prior probability density to all points within some range) is chosen for this purpose. In reality, the appropriate prior depends on the parameterization chosen for the model of evolution, a choice that is largely arbitrary. There is an extensive Bayesian literature on appropriate prior choice, and it has long been appreciated that there are parameterizations for which uniform priors can have a strong influence on posterior estimates. We here discuss the relationship between model parameterization and prior specification, using the general time-reversible model of nucleotide evolution as an example. We present Bayesian analyses of 10 simulated data sets obtained using a variety of prior distributions and parameterizations of the general time-reversible model. Uniform priors can produce biased parameter estimates under realistic conditions, and a variety of alternative priors avoid this bias.
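The dependence of "uniform" on parameterization is easy to demonstrate with a toy transformation (not the GTR model itself): a flat prior on a proportion q induces a strongly skewed prior on the ratio r = q / (1 - q).

```python
import random
random.seed(1)

# Monte Carlo illustration that "uniform" is parameterization-dependent: a
# flat prior on a proportion q in (0, 1) induces a heavily skewed prior on
# the ratio r = q / (1 - q).
draws = [random.random() for _ in range(100_000)]
ratios = [q / (1.0 - q) for q in draws]

# Half of the induced prior mass sits below r = 1 ...
frac_below_1 = sum(r < 1.0 for r in ratios) / len(ratios)
# ... while the other half is stretched over (1, infinity): far from flat.
frac_below_10 = sum(r < 10.0 for r in ratios) / len(ratios)  # ~10/11
```

Exchange rates and proportions in exactly this way arise when switching between rate-ratio and frequency parameterizations of substitution models, which is why the same "uninformative" choice can bias one parameterization and not another.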

5.
In Bayesian phylogenetics, confidence in evolutionary relationships is expressed as posterior probability: the probability that a tree or clade is true given the data, evolutionary model, and prior assumptions about model parameters. Model parameters, such as branch lengths, are never known in advance; Bayesian methods incorporate this uncertainty by integrating over a range of plausible values given an assumed prior probability distribution for each parameter. Little is known about the effects of integrating over branch length uncertainty on posterior probabilities when different priors are assumed. Here, we show that integrating over uncertainty using a wide range of typical prior assumptions strongly affects posterior probabilities, causing them to deviate from those that would be inferred if branch lengths were known in advance; only when there is no uncertainty to integrate over does the average posterior probability of a group of trees accurately predict the proportion of correct trees in the group. The pattern of branch lengths on the true tree determines whether integrating over uncertainty pushes posterior probabilities upward or downward. The magnitude of the effect depends on the specific prior distributions used and the length of the sequences analyzed. Under realistic conditions, however, even extraordinarily long sequences are not enough to prevent frequent inference of incorrect clades with strong support. We found that across a range of conditions, diffuse priors (either flat or exponential distributions with moderate to large means) provide more reliable inferences than small-mean exponential priors. An empirical Bayes approach that fixes branch lengths at their maximum likelihood estimates yields posterior probabilities that more closely match those that would be inferred if the true branch lengths were known in advance and reduces the rate of strongly supported false inferences compared with fully Bayesian integration.

6.
Computational modeling is being used increasingly in neuroscience. In deriving such models, inference issues such as model selection, model complexity, and model comparison must be addressed constantly. In this article we briefly present the Bayesian approach to inference. Under a simple set of commonsense axioms, there exists essentially a unique way of reasoning under uncertainty by assigning a degree of confidence to any hypothesis or model, given the available data and prior information. Such degrees of confidence must obey all the rules governing probabilities and can be updated accordingly as more data become available. While the Bayesian methodology can be applied to any type of model, as an example we outline its use for an important, and increasingly standard, class of models in computational neuroscience: compartmental models of single neurons. Inference issues are particularly relevant for these models: their parameter spaces are typically very large, neurophysiological and neuroanatomical data are still sparse, and probabilistic aspects are often ignored. As a tutorial, we demonstrate the Bayesian approach on a class of one-compartment models with varying numbers of conductances. We then apply Bayesian methods on a compartmental model of a real neuron to determine the optimal amount of noise to add to the model to give it a level of spike time variability comparable to that found in the real cell.
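The updating rule described above, degrees of confidence revised by Bayes' rule as data accumulate, can be sketched for two hypothetical competing models (all likelihood values are illustrative):

```python
# Minimal sketch of sequential Bayesian updating of confidence in two
# competing models; likelihood values are illustrative only.
def update(prior_m1, like_m1, like_m2):
    evidence = prior_m1 * like_m1 + (1.0 - prior_m1) * like_m2
    return prior_m1 * like_m1 / evidence

p_m1 = 0.5                                    # equal prior confidence
for like_m1, like_m2 in [(0.8, 0.4), (0.7, 0.5), (0.9, 0.3)]:
    p_m1 = update(p_m1, like_m1, like_m2)     # one update per observation
# After three observations favoring model 1, p_m1 = 8.4 / 9.4 (about 0.894)
```

Because updating multiplies likelihood ratios, the order of the observations does not matter; only their cumulative evidence does.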

7.
Comparison of the performance and accuracy of different inference methods, such as maximum likelihood (ML) and Bayesian inference, is difficult because the inference methods are implemented in different programs, often written by different authors. Both methods were implemented in the program MIGRATE, which estimates population genetic parameters, such as population sizes and migration rates, using coalescence theory. Both inference methods use the same Markov chain Monte Carlo algorithm and differ from each other in only two aspects: parameter proposal distribution and maximization of the likelihood function. Using simulated datasets, the Bayesian method generally fares better than the ML approach in accuracy and coverage, although for some values the two approaches are equal in performance. MOTIVATION: The Markov chain Monte Carlo-based ML framework can fail on sparse data and can deliver non-conservative support intervals. A Bayesian framework with an appropriate prior distribution is able to remedy some of these problems. RESULTS: The program MIGRATE was extended to allow not only for ML estimation of population genetics parameters but also for using a Bayesian framework. Comparisons between the Bayesian approach and the ML approach are facilitated because both modes estimate the same parameters under the same population model and assumptions.

8.
Two-part joint models for a longitudinal semicontinuous biomarker and a terminal event have recently been introduced based on frequentist estimation. The biomarker distribution is decomposed into a probability of a positive value and the expected value among positive values. Shared random effects can represent the association structure between the biomarker and the terminal event. The computational burden increases compared to standard joint models with a single regression model for the biomarker. In this context, the frequentist estimation implemented in the R package frailtypack can be challenging for complex models (i.e., a large number of parameters and a high dimension of the random effects). As an alternative, we propose a Bayesian estimation of two-part joint models based on the Integrated Nested Laplace Approximation (INLA) algorithm to alleviate the computational burden and fit more complex models. Our simulation studies confirm that INLA provides accurate approximations of posterior estimates and reduces the computation time and variability of estimates compared to frailtypack in the situations considered. We contrast the Bayesian and frequentist approaches in the analysis of two randomized cancer clinical trials (GERCOR and PRIME studies), where INLA shows reduced variability for the association between the biomarker and the risk of event. Moreover, the Bayesian approach was able to characterize subgroups of patients associated with different responses to treatment in the PRIME study. Our study suggests that the Bayesian approach using the INLA algorithm makes it possible to fit complex joint models that might be of interest in a wide range of clinical applications.

9.
Understanding the determinants of species' distributions and abundances is a central theme in ecology. The development of statistical models to achieve this has a long history, and the notion that the model should closely reflect underlying scientific understanding has encouraged ecologists to adopt complex statistical methods as they arise. In this paper we describe a Bayesian hierarchical model that reflects a conceptual ecological model of multi-scaled environmental determinants of riverine fish species' distributions and abundances. We illustrate this with distribution and abundance data of a small-bodied fish species, the Empire gudgeon Hypseleotris galii, in the Mary and Albert Rivers, Queensland, Australia. Specifically, the model sought to address: 1) the extent to which landscape-scale abiotic variables can explain the species' distribution compared to local-scale variables, 2) how local-scale abiotic variables can explain species' abundances, and 3) how these local-scale relationships are mediated by landscape-scale variables. Overall, the model accounted for around 60% of variation in the distribution and abundance of H. galii. The findings show that the landscape-scale variables explain much of the distribution of the species; however, there was considerable improvement in estimating the species' distribution with the addition of local-scale variables. There were many strong relationships between abundance and local-scale abiotic variables; however, several of these relationships were mediated by some of the landscape-scale variables. The extent of spatial autocorrelation in the data was relatively low compared to the distances among sampling reaches. Our findings exemplify that Bayesian hierarchical modelling provides a robust statistical framework that reflects our ecological understanding. This allows ecologists to address a range of ecological questions with a single unified probability model rather than a series of disconnected analyses.

10.
Salway R, Wakefield J. Biometrics 2008, 64(2): 620-626
This article considers the modeling of single-dose pharmacokinetic data. Traditionally, so-called compartmental models have been used to analyze such data. Unfortunately, the mean functions of such models are sums of exponentials, for which inference and computation may not be straightforward. We present an alternative to these models based on generalized linear models with a logarithmic link and gamma distribution, for which desirable statistical properties exist. The latter has a constant coefficient of variation, which is often appropriate for pharmacokinetic data. Inference is convenient from either a likelihood or a Bayesian perspective. We consider models for both single and multiple individuals, the latter via generalized linear mixed models. For single individuals, Bayesian computation may be carried out with recourse to simulation. We describe a rejection algorithm that, unlike Markov chain Monte Carlo, produces independent samples from the posterior and allows straightforward calculation of Bayes factors for model comparison. We also illustrate how prior distributions may be specified in terms of model-free pharmacokinetic parameters of interest. The methods are applied to data from 12 individuals following administration of the antiasthmatic agent theophylline.
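The constant coefficient of variation the authors rely on can be checked numerically: for a gamma distribution, CV = 1/sqrt(shape) whatever the mean. A Monte Carlo sketch (shape and means hypothetical):

```python
import random
random.seed(42)

# Monte Carlo check that a gamma distribution's coefficient of variation is
# 1/sqrt(shape), independent of its mean; this is why a gamma GLM with log
# link can suit concentration data whose spread scales with its level.
def empirical_cv(shape, mean, n=50_000):
    scale = mean / shape
    xs = [random.gammavariate(shape, scale) for _ in range(n)]
    m = sum(xs) / n
    sd = (sum((x - m) ** 2 for x in xs) / n) ** 0.5
    return sd / m

# Same shape, very different means: the CV stays near 1/sqrt(4) = 0.5.
cv_low_mean = empirical_cv(shape=4.0, mean=1.0)
cv_high_mean = empirical_cv(shape=4.0, mean=100.0)
```

In a gamma GLM the log link lets the mean vary with covariates while the shape, and hence the CV, stays fixed across observations.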

11.
I describe an open-source R package, multimark, for estimation of survival and abundance from capture–mark–recapture data consisting of multiple "noninvasive" marks. Noninvasive marks include natural pelt or skin patterns, scars, and genetic markers that enable individual identification in lieu of physical capture. multimark provides a means for combining and jointly analyzing encounter histories from multiple noninvasive sources that otherwise cannot be reliably matched (e.g., left- and right-sided photographs of bilaterally asymmetrical individuals). The package is currently capable of fitting open population Cormack–Jolly–Seber (CJS) and closed population abundance models with up to two mark types using Bayesian Markov chain Monte Carlo (MCMC) methods. multimark can also be used for Bayesian analyses of conventional capture–recapture data consisting of a single mark type. Some package features include (1) general model specification using formulas already familiar to most R users, (2) ability to include temporal, behavioral, age, cohort, and individual heterogeneity effects in detection and survival probabilities, (3) improved MCMC algorithm that is computationally faster and more efficient than previously proposed methods, (4) Bayesian multimodel inference using reversible jump MCMC, and (5) data simulation capabilities for power analyses and assessing model performance. I demonstrate use of multimark using left- and right-sided encounter histories for bobcats (Lynx rufus) collected from remote single-camera stations in southern California. In this example, there is evidence of a behavioral effect (i.e., trap "happy" response) that is otherwise indiscernible using conventional single-sided analyses.
The package will be most useful to ecologists seeking stronger inferences by combining different sources of mark–recapture data that are difficult (or impossible) to reliably reconcile, particularly with the sparse datasets typical of rare or elusive species for which noninvasive sampling techniques are most commonly employed. Addressing deficiencies in currently available software, multimark also provides a user-friendly interface for performing Bayesian multimodel inference using capture–recapture data consisting of a single conventional mark or multiple noninvasive marks.

12.
Aim Conservation practitioners use biological surveys to ascertain whether or not a site is occupied by a particular species. Widely used statistical methods estimate the probability that a species will be detected in a survey of an occupied site. However, these estimates of detection probability are alone not sufficient to calculate the probability that a species is present given that it was not detected. The aim of this paper is to demonstrate methods for correctly calculating (1) the probability a species occupies a site given one or more non-detections, and (2) the number of sequential non-detections necessary to assert, with a pre-specified confidence, that a species is absent from a site. Location Occupancy data for a tree frog in eastern Australia serve to illustrate methods that may be applied anywhere species' occupancy data are used and detection probabilities are < 1. Methods Building on Bayesian expressions for the probability that a site is occupied by a species when it is not detected, and the number of non-detections necessary to assert absence with a pre-specified confidence, we estimate occupancy probabilities across tree frog survey locations, drawing on information about where and when the species was detected during surveys. Results We show that the number of sequential non-detections necessary to assert that a species is absent increases nonlinearly with the prior probability of occupancy, the probability of detection if present, and the desired level of confidence about absence. Main conclusions If used more widely, the Bayesian analytical approaches illustrated here would improve collection and interpretation of biological survey data, providing a coherent way to incorporate detection probability estimates in the design of minimum survey requirements for monitoring, impact assessment and distribution modelling.
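The two calculations described above follow directly from Bayes' rule; a sketch, with hypothetical values for the prior occupancy probability psi and per-survey detection probability p:

```python
# Bayes-rule sketch of the two quantities discussed above
# (psi and p values are hypothetical).
def occupancy_given_nondetections(psi, p, n):
    """Posterior P(site occupied) after n surveys with no detection."""
    miss = (1.0 - p) ** n
    return psi * miss / (psi * miss + (1.0 - psi))

def surveys_needed(psi, p, alpha):
    """Smallest number of sequential non-detections needed before the
    posterior occupancy probability drops to alpha or below."""
    n = 0
    while occupancy_given_nondetections(psi, p, n) > alpha:
        n += 1
    return n

post_after_3 = occupancy_given_nondetections(psi=0.5, p=0.5, n=3)  # 1/9
n_required = surveys_needed(psi=0.5, p=0.5, alpha=0.05)            # 5 surveys
```

Raising the prior psi, lowering p, or tightening alpha all inflate the required number of surveys nonlinearly, which is the paper's central point about survey design.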

13.
Shared random effects joint models are becoming increasingly popular for investigating the relationship between longitudinal and time-to-event data. Although appealing, such complex models are computationally intensive, and quick, approximate methods may provide a reasonable alternative. In this paper, we first compare the shared random effects model with two approximate approaches: a naïve proportional hazards model with time-dependent covariate and a two-stage joint model, which uses plug-in estimates of the fitted values from a longitudinal analysis as covariates in a survival model. We show that the approximate approaches should be avoided since they can severely underestimate any association between the current underlying longitudinal value and the event hazard. We present classical and Bayesian implementations of the shared random effects model and highlight the advantages of the latter for making predictions. We then apply the models described to a study of abdominal aortic aneurysms (AAA) to investigate the association between AAA diameter and the hazard of AAA rupture. Out-of-sample predictions of future AAA growth and hazard of rupture are derived from Bayesian posterior predictive distributions, which are easily calculated within an MCMC framework. Finally, using a multivariate survival sub-model we show that underlying diameter rather than the rate of growth is the most important predictor of AAA rupture.

14.
Open population capture-recapture models are widely used to estimate population demographics and abundance over time. Bayesian methods exist to incorporate open population modeling with spatial capture-recapture (SCR), allowing for estimation of the effective area sampled and population density. Here, open population SCR is formulated as a hidden Markov model (HMM), allowing inference by maximum likelihood for both Cormack-Jolly-Seber and Jolly-Seber models, with and without activity center movement. The method is applied to a 12-year survey of male jaguars (Panthera onca) in the Cockscomb Basin Wildlife Sanctuary, Belize, to estimate survival probability and population abundance over time. For this application, inference is shown to be biased when assuming activity centers are fixed over time, while including a model for activity center movement provides negligible bias and nominal confidence interval coverage, as demonstrated by a simulation study. The HMM approach is compared with Bayesian data augmentation and closed population models for this application. The method is substantially more computationally efficient than the Bayesian approach and provides a lower root-mean-square error in predicting population density compared to closed population models.
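The HMM formulation can be sketched in stripped-down form for a non-spatial Cormack-Jolly-Seber model, assuming two latent states (alive, dead), survival probability phi and detection probability p (all values illustrative):

```python
# Stripped-down sketch of the HMM view of a Cormack-Jolly-Seber model.
# The forward algorithm accumulates the likelihood of one capture history,
# conditional on first capture; phi and p values are illustrative.
def cjs_likelihood(history, phi, p):
    alive, dead = 1.0, 0.0                  # forward probabilities at release
    for obs in history[1:]:
        # Transition step: survive to the next occasion with probability phi.
        alive, dead = alive * phi, dead + alive * (1.0 - phi)
        # Emission step: only live animals can be detected.
        if obs == 1:
            alive, dead = alive * p, 0.0    # dead animals are never detected
        else:
            alive = alive * (1.0 - p)
    return alive + dead

# History "seen, missed, seen": phi*(1-p) * phi*p = 0.8*0.4*0.8*0.6 = 0.1536
lik = cjs_likelihood([1, 0, 1], phi=0.8, p=0.6)
```

Because the forward recursion marginalizes the latent alive/dead state exactly, the likelihood can be maximized directly, which is the source of the computational advantage over MCMC with data augmentation.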

15.
Model averaging is gaining popularity among ecologists for making inference and predictions. Methods for combining models include Bayesian model averaging (BMA) and Akaike’s Information Criterion (AIC) model averaging. BMA can be implemented with different prior model weights, including the Kullback–Leibler prior associated with AIC model averaging, but it is unclear how the prior model weight affects model results in a predictive context. Here, we implemented BMA using the Bayesian Information Criterion (BIC) approximation to Bayes factors for building predictive models of bird abundance and occurrence in the Chihuahuan Desert of New Mexico. We examined how model predictive ability differed across four prior model weights, and how averaged coefficient estimates, standard errors and coefficients’ posterior probabilities varied for 16 bird species. We also compared the predictive ability of BMA models to a best single-model approach. Overall, Occam’s prior of parsimony provided the best predictive models, whereas the Kullback–Leibler prior generally favored complex models of lower predictive ability. BMA performed better than a best single-model approach independently of the prior model weight for 6 out of 16 species. For 6 other species, the choice of the prior model weight affected whether BMA was better than the best single-model approach. Our results demonstrate that parsimonious priors may be favorable over priors that favor complexity for making predictions. The approach we present has direct applications in ecology for better predicting patterns of species’ abundance and occurrence.
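BIC-approximated BMA with explicit prior model weights can be sketched as follows; the BIC values and predictions below are made up for illustration:

```python
import math

# Sketch of BIC-approximated posterior model probabilities: posterior weight
# is proportional to exp(-BIC/2) times the prior model weight.
def bma_weights(bics, prior_weights):
    best = min(bics)                        # rescale for numerical stability
    scores = [math.exp(-(b - best) / 2.0) * w
              for b, w in zip(bics, prior_weights)]
    total = sum(scores)
    return [s / total for s in scores]

bics = [100.0, 101.0, 104.0]
equal_prior = bma_weights(bics, [1 / 3, 1 / 3, 1 / 3])

# A model-averaged prediction mixes the per-model predictions by weight.
preds = [2.0, 2.5, 4.0]
avg_pred = sum(w * y for w, y in zip(equal_prior, preds))
```

Swapping the equal prior for a parsimony-favoring or complexity-favoring prior changes the weights, and therefore the averaged prediction, which is exactly the sensitivity the study examines.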

16.
A Bayesian method was developed for identifying genetic markers linked to quantitative trait loci (QTL) by analyzing data from daughter or granddaughter designs and single markers or marker pairs. Traditional methods may yield unrealistic results because linkage tests depend on the number of markers, and QTL gene effects associated with selected markers are overestimated. The Bayesian or posterior probability of linkage combines information from a daughter or granddaughter design with the prior probability of linkage between a marker locus and a QTL. If the posterior probability exceeds a certain quantity, linkage is declared. Upon linkage acceptance, Bayesian estimates of marker-QTL recombination rate and QTL gene effects and frequencies are obtained. The Bayesian estimates of QTL gene effects account for different amounts of information by shrinking information from data toward the mean or mode of a prior exponential distribution of gene effects. Computation of the Bayesian analysis is feasible. Exact results are given for biallelic QTL, and extensions to multiallelic QTL are suggested.

17.
This paper uses the analysis of a data set to examine a number of issues in Bayesian statistics and the application of MCMC methods. The data concern the selectivity of fishing nets, and logistic regression is used to relate the size of a fish to the probability that it will be retained or escape from a trawl net. Hierarchical models relate information from different trawls, and posterior distributions are determined using MCMC. Centring data is shown to radically reduce autocorrelation in chains, and Rao-Blackwellisation and chain-thinning are found to have little effect on parameter estimates. The results of four convergence diagnostics are compared and the sensitivity of the posterior distribution to the prior distribution is examined using a novel method. Nested models are fitted to the data and compared using intrinsic Bayes factors, pseudo-Bayes factors and credible intervals.
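Why centring helps can be shown in toy form: with a flat prior, the posterior correlation between intercept and slope in simple linear regression is -mean(x)/sqrt(mean(x^2)), so centring the covariate makes the two parameters uncorrelated and the chains mix far better (the covariate values below are hypothetical):

```python
# With a flat prior, the intercept-slope posterior correlation in simple
# linear regression comes from the (X'X)^{-1} matrix and equals
# -mean(x) / sqrt(mean(x^2)); centring the covariate drives it to zero.
def intercept_slope_corr(xs):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_x2 = sum(x * x for x in xs) / n
    return -mean_x / mean_x2 ** 0.5

raw = [float(i) for i in range(1, 11)]             # hypothetical fish sizes
centred = [x - sum(raw) / len(raw) for x in raw]

corr_raw = intercept_slope_corr(raw)          # strongly negative (~ -0.886)
corr_centred = intercept_slope_corr(centred)  # exactly zero
```

High posterior correlation between parameters is what makes componentwise MCMC updates take small, highly autocorrelated steps; removing the correlation by centring removes the problem at its source.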

18.
Inferring the demographic history of species and their populations is crucial to understand their contemporary distribution, abundance and adaptations. The high computational overhead of likelihood-based inference approaches severely restricts their applicability to large data sets or complex models. In response to these restrictions, approximate Bayesian computation (ABC) methods have been developed to infer the demographic past of populations and species. Here, we present the results of an evaluation of the ABC-based approach implemented in the popular software package diyabc using simulated data sets (mitochondrial DNA sequences, microsatellite genotypes and single nucleotide polymorphisms). We simulated population genetic data under five different simple, single-population models to assess the model recovery rates as well as the bias and error of the parameter estimates. The ability of diyabc to recover the correct model was relatively low (0.49): 0.6 for the simplest models and 0.3 for the more complex models. The recovery rate improved significantly when reducing the number of candidate models from five to three (from 0.57 to 0.71). Among the parameters of interest, the effective population size was estimated with higher accuracy compared to the timing of events. Increased amounts of genetic data did not significantly improve the accuracy of the parameter estimates. Some gains in accuracy and decreases in error were observed for scaled parameters (e.g., Neμ) compared to unscaled parameters (e.g., Ne and μ). We concluded that diyabc-based assessments are not suited to capture a detailed demographic history, but might be efficient at capturing simple, major demographic changes.
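The rejection step at the heart of ABC can be sketched with a toy Bernoulli model (not DIYABC itself): keep prior draws whose simulated summary statistic lands within a tolerance of the observed one.

```python
import random
random.seed(7)

# Minimal ABC rejection sketch (toy model, all numbers hypothetical):
# infer the success probability of a Bernoulli process from an observed
# count by retaining prior draws whose simulated count is close enough.
def simulate(theta, n=100):
    return sum(random.random() < theta for _ in range(n))

observed = 30                        # observed successes out of 100 trials
accepted = []
for _ in range(20_000):
    theta = random.random()          # draw from a Uniform(0, 1) prior
    if abs(simulate(theta) - observed) <= 3:
        accepted.append(theta)       # retained draws approximate the posterior

posterior_mean = sum(accepted) / len(accepted)   # close to 0.3
```

Real ABC software replaces the count with multiple summary statistics and the simple simulator with a full coalescent simulation, but the accept/reject logic, and its sensitivity to the choice of summaries and tolerance, is the same.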

19.
Estimating nonlinear dose-response relationships in the context of pharmaceutical clinical trials is often a challenging problem. The data in these trials are typically variable and sparse, making this a hard inference problem, despite sometimes seemingly large sample sizes. Maximum likelihood estimates often fail to exist in these situations, while for Bayesian methods, prior selection becomes a delicate issue when no carefully elicited prior is available, as the posterior distribution will often be sensitive to the priors chosen. This article provides guidance on the usage of functional uniform prior distributions in these situations. The essential idea of functional uniform priors is to employ a distribution that weights the functional shapes of the nonlinear regression function equally. By doing so one obtains a distribution that exhaustively and uniformly covers the underlying potential shapes of the nonlinear function. On the parameter scale these priors will often result in quite nonuniform prior distributions. This paper gives hints on how to implement these priors in practice and illustrates them in realistic trial examples in the context of Phase II dose-response trials as well as Phase I first-in-human studies.

20.
Numerous simulation studies have investigated the accuracy of phylogenetic inference of gene trees under maximum parsimony, maximum likelihood, and Bayesian techniques. The relative accuracy of species tree inference methods under simulation has received less study. The number of analytical techniques available for inferring species trees is increasing rapidly, and in this paper, we compare the performance of several species tree inference techniques at estimating recent species divergences using computer simulation. Simulating gene trees within species trees of different shapes and with varying tree lengths (T) and population sizes (θ), and evolving sequences on those gene trees, allows us to determine how phylogenetic accuracy changes in relation to different levels of deep coalescence and phylogenetic signal. When the probability of discordance between the gene trees and the species tree is high (i.e., T is small and/or θ is large), Bayesian species tree inference using the multispecies coalescent (BEST) outperforms other methods. The performance of all methods improves as the total length of the species tree is increased, which reflects the combined benefits of decreasing the probability of discordance between species trees and gene trees and gaining more accurate estimates for gene trees. Decreasing the probability of deep coalescences by reducing θ also leads to accuracy gains for most methods. Increasing the number of loci from 10 to 100 improves accuracy under difficult demographic scenarios (i.e., coalescent units ≤ 4Ne), but 10 loci are adequate for estimating the correct species tree in cases where deep coalescence is limited or absent. In general, the correlation between the phylogenetic accuracy and the posterior probability values obtained from BEST is high, although posterior probabilities are overestimated when the prior distribution for θ is misspecified.

