Similar literature
1.
Nathan P. Lemoine 《Oikos》2019,128(7):912-928
Throughout the last two decades, Bayesian statistical methods have proliferated throughout ecology and evolution. Numerous previous references established both philosophical and computational guidelines for implementing Bayesian methods. However, protocols for incorporating prior information, the defining characteristic of Bayesian philosophy, are nearly nonexistent in the ecological literature. Here, I hope to encourage the use of weakly informative priors in ecology and evolution by providing a ‘consumer's guide’ to weakly informative priors. The first section outlines three reasons why ecologists should abandon noninformative priors: 1) common flat priors are not always noninformative, 2) noninformative priors provide the same result as simpler frequentist methods, and 3) noninformative priors suffer from the same high type I and type M error rates as frequentist methods. The second section provides a guide for implementing informative priors, wherein I detail convenient ‘reference’ prior distributions for common statistical models (i.e. regression, ANOVA, hierarchical models). I then use simulations to visually demonstrate how informative priors influence posterior parameter estimates. With the guidelines provided here, I hope to encourage the use of weakly informative priors for Bayesian analyses in ecology. Ecologists can and should debate the appropriate form of prior information, but should consider weakly informative priors as the new ‘default’ prior for any Bayesian model.  相似文献   
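The influence of a weakly informative prior described in this abstract can be illustrated with a minimal sketch, assuming the simplest conjugate case (a normal mean with known sampling standard deviation). The function name `normal_posterior` and all numbers are hypothetical and do not come from the paper.

```python
def normal_posterior(prior_mean, prior_sd, ybar, sigma, n):
    """Posterior mean and sd of a normal mean with known sampling sd `sigma`."""
    prior_prec = 1.0 / prior_sd ** 2      # precision contributed by the prior
    data_prec = n / sigma ** 2            # precision contributed by the data
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * ybar)
    return post_mean, post_var ** 0.5

# Hypothetical small study: 5 observations, sample mean 2.0, known sd 2.0.
vague = normal_posterior(0.0, 1000.0, ybar=2.0, sigma=2.0, n=5)  # near-flat prior
weak = normal_posterior(0.0, 1.0, ybar=2.0, sigma=2.0, n=5)      # weakly informative prior

print("vague prior:", vague)
print("weakly informative prior:", weak)
```

Under these assumptions the near-flat prior essentially reproduces the sample mean, while the weakly informative prior shrinks the estimate toward zero and narrows the posterior.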

2.
1. Informative Bayesian priors can improve the precision of estimates in ecological studies or estimate parameters for which little or no information is available. While Bayesian analyses are becoming more popular in ecology, the use of strongly informative priors remains rare, perhaps because examples of informative priors are not readily available in the published literature. 2. Dispersal distance is an important ecological parameter, but is difficult to measure and estimates are scarce. General models that provide informative prior estimates of dispersal distances will therefore be valuable. 3. Using a world-wide data set on birds, we develop a predictive model of median natal dispersal distance that includes body mass, wingspan, sex and feeding guild. This model predicts median dispersal distance well when using the fitted data and an independent test data set, explaining up to 53% of the variation. 4. Using this model, we predict a priori estimates of median dispersal distance for 57 woodland-dependent bird species in northern Victoria, Australia. These estimates are then used to investigate the relationship between dispersal ability and vulnerability to landscape-scale changes in habitat cover and fragmentation. 5. We find evidence that woodland bird species with poor predicted dispersal ability are more vulnerable to habitat fragmentation than those species with longer predicted dispersal distances, thus improving the understanding of this important phenomenon. 6. The value of constructing informative priors from existing information is also demonstrated. When used as informative priors for four example species, predicted dispersal distances reduced the 95% credible intervals of posterior estimates of dispersal distance by 8-19%. 
Further, should we have wished to collect information on avian dispersal distances and relate it to species' responses to habitat loss and fragmentation, data from 221 individuals across 57 species would have been required to obtain estimates with the same precision as those provided by the general model.

3.
Pilot studies are often used to help design ecological studies. Ideally the pilot data are incorporated into the full-scale study data, but if the pilot study's results indicate a need for major changes to experimental design, then pooling pilot and full-scale study data is difficult. The default position is to disregard the preliminary data. But ignoring pilot study data after a more comprehensive study has been completed forgoes statistical power or costs more by sampling additional data equivalent to the pilot study's sample size. With Bayesian methods, pilot study data can be used as an informative prior for a model built from the full-scale study dataset. We demonstrate a Bayesian method for recovering information from otherwise unusable pilot study data with a case study on eucalypt seedling mortality. A pilot study of eucalypt tree seedling mortality was conducted in southeastern Australia in 2005. A larger study with a modified design was conducted the following year. The two datasets differed substantially, so they could not easily be combined. Posterior estimates from pilot dataset model parameters were used to inform a model for the second larger dataset. Model checking indicated that incorporating prior information maintained the predictive capacity of the model with respect to the training data. Importantly, adding prior information improved model accuracy in predicting a validation dataset. Adding prior information increased the precision and the effective sample size for estimating the average mortality rate. We recommend that practitioners move away from the default position of discarding pilot study data when they are incompatible with the form of their full-scale studies. More generally, we recommend that ecologists should use informative priors more frequently to reap the benefits of the additional data.  相似文献   
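The idea of carrying pilot data forward as an informative prior can be sketched with a beta-binomial model for a mortality rate. This is a deliberately simplified illustration, not the study's model: the counts are hypothetical, and because the two datasets here are directly comparable the result equals simple pooling, whereas the abstract concerns designs that could not easily be combined.

```python
def beta_update(a, b, deaths, survivors):
    """Conjugate update of a Beta(a, b) prior with binomial mortality data."""
    return a + deaths, b + survivors

def beta_mean_var(a, b):
    """Mean and variance of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var

# Hypothetical counts, for illustration only.
pilot_posterior = beta_update(1, 1, deaths=12, survivors=28)            # flat prior + pilot data
full_flat = beta_update(1, 1, deaths=30, survivors=70)                  # full study, pilot discarded
full_informed = beta_update(*pilot_posterior, deaths=30, survivors=70)  # pilot posterior used as prior

print("flat prior:", beta_mean_var(*full_flat))
print("pilot-informed prior:", beta_mean_var(*full_informed))
```

In this sketch the pilot-informed posterior has a smaller variance than the flat-prior posterior, which mirrors the precision gain the abstract reports.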

4.
No universal method exists for determining the diet composition of predators. Most traditional methods are biased because they rely on differential digestibility and the recovery of hard items. By relying on assimilated food, stable isotopes and Bayesian mixing models (SIMMs) resolve many biases of traditional methods. SIMMs can incorporate prior information (i.e. proportional diet composition) that may improve the precision of the estimated dietary composition. However, few studies have assessed the performance of traditional methods and of SIMMs with and without informative priors for studying predators’ diets. Here we compare the diet compositions of the South American fur seal and sea lion obtained by scat analysis and by SIMMs with uninformative priors (SIMM-UP), and assess whether informative priors from the scat analysis (SIMM-IP) improved the estimated diet composition compared to SIMM-UP. According to SIMM-UP, pelagic species dominated the fur seal’s diet, whereas no prey clearly dominated the sea lion’s diet. In contrast, the diet compositions from SIMM-IP were dominated by the same prey as in the scat analyses. When prior information influenced the SIMM estimates, incorporating informative priors improved the precision of the estimated diet composition at the risk of inducing biases in the estimates. If prey isotopic data allow the contributions of different prey to the diet to be discriminated, informative priors should lead to more precise but unbiased estimates of diet composition. Just as estimates of diet composition obtained from traditional methods are critically interpreted because of their biases, care must be exercised when interpreting diet composition obtained by SIMM-IP. The best approach to obtaining a near-complete view of predators’ diet composition should involve the simultaneous consideration of different sources of partial evidence (traditional methods, SIMM-UP and SIMM-IP) in the light of the natural history of the predator species, so as to reliably ascertain and weigh the information yielded by each method.

5.
Monitoring is an essential part of reintroduction programs, but many years of data may be needed to obtain reliable population projections. This duration can potentially be reduced by incorporating prior information on expected vital rates (survival and fecundity) when making inferences from monitoring data. The prior distributions for these parameters can be derived from data for previous reintroductions, but it is important to account for site‐to‐site variation. We evaluated whether such informative priors improved our ability to estimate the finite rate of increase (λ) of the North Island robin (Petroica longipes) population reintroduced to Tawharanui Regional Park, New Zealand. We assessed how precision improved with each year of postrelease data added, comparing models that used informative or uninformative priors. The population grew from about 22 to 80 individuals from 2007 to 2016, with λ estimated to be 1.23 if density dependence was included in the model and 1.13 otherwise. Under either model, 7 years of data were required before the lower 95% credible limit for λ was > 1, giving confidence that the population would persist. The informative priors did not reduce this requirement. Data‐derived priors are useful before reintroduction because they allow λ to be estimated in advance. However, in the case examined here, the value of the priors was overwhelmed once site‐specific monitoring data became available. The Bayesian method presented is logical for reintroduced populations. It allows prior information (used to inform prerelease decisions) to be integrated with postrelease monitoring. This makes full use of the data for ongoing management decisions. However, if the priors properly account for site‐to‐site variation, they may have little predictive value compared with the site‐specific data. This value will depend on the degree of site‐to‐site variation as well as the quality of the data.  相似文献   
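As a rough arithmetic check on the figures quoted in this abstract, a crude geometric-mean growth rate can be computed from the approximate start and end population sizes. This sketch ignores density dependence, demographic structure and uncertainty, so it is not the paper's estimator; it only shows the order of magnitude of λ implied by growth from about 22 to 80 individuals over 2007 to 2016.

```python
# Approximate population sizes quoted in the abstract.
n_start, n_end = 22, 80
years = 2016 - 2007  # nine annual steps

# Geometric-mean finite rate of increase: N_end = N_start * lam ** years
lam = (n_end / n_start) ** (1 / years)
print(round(lam, 3))  # about 1.154
```

The value falls between the two model-based estimates reported in the abstract (1.13 and 1.23).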

6.
Obtaining useful estimates of wildlife abundance or density requires thoughtful attention to potential sources of bias and precision, and it is widely understood that addressing incomplete detection is critical to appropriate inference. When the underlying assumptions of sampling approaches are violated, both increased bias and reduced precision of the population estimator may result. Bear (Ursus spp.) populations can be difficult to sample and are often monitored using mark‐recapture distance sampling (MRDS) methods, although obtaining adequate sample sizes can be cost prohibitive. With the goal of improving inference, we examined the underlying methodological assumptions and estimator efficiency of three datasets collected under an MRDS protocol designed specifically for bears. We analyzed these data using MRDS, conventional distance sampling (CDS), and open‐distance sampling approaches to evaluate the apparent bias‐precision tradeoff relative to the assumptions inherent under each approach. We also evaluated the incorporation of informative priors on detection parameters within a Bayesian context. We found that the CDS estimator had low apparent bias and was more efficient than the more complex MRDS estimator. When combined with informative priors on the detection process, precision was increased by >50% compared to the MRDS approach with little apparent bias. In addition, open‐distance sampling models revealed a serious violation of the assumption that all bears were available to be sampled. Inference is directly related to the underlying assumptions of the survey design and the analytical tools employed. We show that for aerial surveys of bears, avoidance of unnecessary model complexity, use of prior information, and the application of open population models can be used to greatly improve estimator performance and simplify field protocols. 
Although we focused on distance sampling‐based aerial surveys for bears, the general concepts we addressed apply to a variety of wildlife survey contexts.

7.
Cheung YK 《Biometrics》2002,58(1):237-240
Gasparini and Eisele (2000, Biometrics 56, 609-615) propose a design for phase I clinical trials during which dose allocation is governed by a Bayesian nonparametric estimate of the dose-response curve. The authors also suggest an elicitation algorithm to establish vague priors. However, in situations where a low percentile is targeted, priors thus obtained can lead to undesirable rigidity given certain trial outcomes that can occur with a nonnegligible probability. Interestingly, improvement can be achieved by prescribing slightly more informative priors. Some guidelines for prior elicitation are established using a connection between this curve-free method and the continual reassessment method.

8.
Individualized anatomical information has been used as prior knowledge in Bayesian inference paradigms of whole-brain network models. However, the actual sensitivity to such personalized information in priors is still unknown. In this study, we introduce the use of fully Bayesian information criteria and leave-one-out cross-validation technique on the subject-specific information to assess different epileptogenicity hypotheses regarding the location of pathological brain areas based on a priori knowledge from dynamical system properties. The Bayesian Virtual Epileptic Patient (BVEP) model, which relies on the fusion of structural data of individuals, a generative model of epileptiform discharges, and a self-tuning Monte Carlo sampling algorithm, is used to infer the spatial map of epileptogenicity across different brain areas. Our results indicate that measuring the out-of-sample prediction accuracy of the BVEP model with informative priors enables reliable and efficient evaluation of potential hypotheses regarding the degree of epileptogenicity across different brain regions. In contrast, while using uninformative priors, the information criteria are unable to provide strong evidence about the epileptogenicity of brain areas. We also show that the fully Bayesian criteria correctly assess different hypotheses about both structural and functional components of whole-brain models that differ across individuals. The fully Bayesian information-theory based approach used in this study suggests a patient-specific strategy for epileptogenicity hypothesis testing in generative brain network models of epilepsy to improve surgical outcomes.  相似文献   

9.
Mixed models are now well‐established methods in ecology and evolution because they allow accounting for and quantifying within‐ and between‐individual variation. However, the required normal distribution of the random effects can often be violated by the presence of clusters among subjects, which leads to multi‐modal distributions. In such cases, using what is known as mixture regression models might offer a more appropriate approach. These models are widely used in psychology, sociology, and medicine to describe the diversity of trajectories occurring within a population over time (e.g. psychological development, growth). In ecology and evolution, however, these models are seldom used even though understanding changes in individual trajectories is an active area of research in life‐history studies. Our aim is to demonstrate the value of using mixture models to describe variation in individual life‐history tactics within a population, and hence to promote the use of these models by ecologists and evolutionary ecologists. We first ran a set of simulations to determine whether and when a mixture model allows teasing apart latent clustering, and to contrast the precision and accuracy of estimates obtained from mixture models versus mixed models under a wide range of ecological contexts. We then used empirical data from long‐term studies of large mammals to illustrate the potential of using mixture models for assessing within‐population variation in life‐history tactics. Mixture models performed well in most cases, except for variables following a Bernoulli distribution and when sample size was small. The four selection criteria we evaluated [Akaike information criterion (AIC), Bayesian information criterion (BIC), and two bootstrap methods] performed similarly well, selecting the right number of clusters in most ecological situations. 
We then showed that the normality of random effects implicitly assumed by evolutionary ecologists when using mixed models was often violated in life‐history data. Mixed models were quite robust to this violation in the sense that fixed effects were unbiased at the population level. However, fixed effects at the cluster level and random effects were better estimated using mixture models. Our empirical analyses demonstrated that using mixture models facilitates the identification of the diversity of growth and reproductive tactics occurring within a population. Therefore, using this modelling framework allows testing for the presence of clusters and, when clusters occur, provides reliable estimates of fixed and random effects for each cluster of the population. In the presence or expectation of clusters, using mixture models offers a suitable extension of mixed models, particularly when evolutionary ecologists aim at identifying how ecological and evolutionary processes change within a population. Mixture regression models therefore provide a valuable addition to the statistical toolbox of evolutionary ecologists. As these models are complex and have their own limitations, we provide recommendations to guide future users.  相似文献   
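Two of the selection criteria named in this abstract, AIC and BIC, have standard closed forms, sketched below. The log-likelihoods, parameter counts and sample size are hypothetical placeholders rather than values from the study; in practice they would come from the fitted mixed and mixture models.

```python
import math

def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return k * math.log(n) - 2 * log_lik

# Hypothetical fits: (maximised log-likelihood, number of free parameters).
n_obs = 100
fits = {"one cluster": (-250.0, 3), "two clusters": (-235.0, 7)}

best_aic = min(fits, key=lambda name: aic(*fits[name]))
best_bic = min(fits, key=lambda name: bic(*fits[name], n_obs))
print(best_aic, best_bic)
```

Lower values indicate the preferred model; BIC penalises extra parameters more heavily than AIC whenever ln(n) exceeds 2.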

10.
To study the lifetimes of certain engineering processes, a lifetime model that can accommodate the nature of such processes is desired. Mixture models of the underlying lifetime distributions are intuitively more appropriate and appealing than simple models for capturing the heterogeneous nature of a process. This paper studies a 3-component mixture of Rayleigh distributions from a Bayesian perspective. The censored sampling environment is considered because of its popularity in reliability theory and survival analysis. Expressions for the Bayes estimators and their posterior risks are derived under different scenarios. For the case in which little or no prior information is available, elicitation of the hyperparameters is given. To examine numerically the performance of the Bayes estimators using non-informative and informative priors under different loss functions, we simulated their statistical properties for different sample sizes and test termination times. In addition, to highlight the practical significance, an illustrative example based on real-life engineering data is given.

11.
When a dataset is imbalanced, predictions for the scarcely sampled subpopulation can be over-influenced by the subpopulation contributing the majority of the data. The aim of this study was to develop a Bayesian modelling approach with a balancing informative prior so that the influence of imbalance on the overall prediction could be minimised. The new approach was developed to weight the data in favour of the smaller subset(s). The method was assessed in terms of bias and precision in predicting model parameter estimates of simulated datasets. Moreover, the method was evaluated in predicting optimal dose levels of tobramycin for various age groups in a motivating example. The bias estimates using the balancing informative prior approach were smaller than those generated using the conventional approach, which did not account for the imbalance in the datasets. The precision estimates were also superior. The resulting dose predictions for tobramycin agreed well with what had been reported in the literature. The proposed Bayesian balancing informative prior approach has shown real potential to adequately weight the data in favour of smaller subset(s) of data to generate robust prediction models.

12.
Agresti A  Min Y 《Biometrics》2005,61(2):515-523
This article investigates the performance, in a frequentist sense, of Bayesian confidence intervals (CIs) for the difference of proportions, relative risk, and odds ratio in 2 x 2 contingency tables. We consider beta priors, logit-normal priors, and related correlated priors for the two binomial parameters. The goal was to analyze whether certain settings for prior parameters tend to provide good coverage performance regardless of the true association parameter values. For the relative risk and odds ratio, we recommend tail intervals over highest posterior density (HPD) intervals, for invariance reasons. To protect against potentially very poor coverage probabilities when the effect is large, it is best to use a diffuse prior, and we recommend the Jeffreys prior. Otherwise, with relatively small samples, Bayesian CIs using more informative (even uniform) priors tend to have poorer performance than the frequentist CIs based on inverting score tests, which perform uniformly quite well for these parameters.
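A tail (equal-tailed) interval for the difference of two proportions under independent Jeffreys Beta(0.5, 0.5) priors can be approximated by Monte Carlo, as in the minimal sketch below. The function name, the counts, the number of draws and the seed are all assumptions made for illustration; the article's own computations and its correlated-prior variants are not reproduced here.

```python
import random

def jeffreys_tail_interval(x1, n1, x2, n2, level=0.95, draws=20000, seed=1):
    """Equal-tailed posterior interval for p1 - p2 with independent Jeffreys priors."""
    rng = random.Random(seed)
    # With a Beta(0.5, 0.5) prior, the posterior is Beta(x + 0.5, n - x + 0.5).
    diffs = sorted(
        rng.betavariate(x1 + 0.5, n1 - x1 + 0.5)
        - rng.betavariate(x2 + 0.5, n2 - x2 + 0.5)
        for _ in range(draws)
    )
    tail = (1.0 - level) / 2.0
    lower = diffs[int(tail * draws)]
    upper = diffs[int((1.0 - tail) * draws) - 1]
    return lower, upper

# Hypothetical 2 x 2 table: 15/40 successes in group 1, 8/40 in group 2.
lower, upper = jeffreys_tail_interval(15, 40, 8, 40)
print(lower, upper)
```

Applying a monotone transformation to each draw, such as the log odds ratio, and taking the same quantiles gives the transformed interval directly, which is the invariance property the abstract cites in favour of tail intervals.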

13.
Stable isotope analysis of diet has become a common tool in conservation research. However, the multiple sources of uncertainty inherent in this analysis framework involve consequences that have not been thoroughly addressed. Uncertainty arises from the choice of trophic discrimination factors, and for Bayesian stable isotope mixing models (SIMMs), the specification of prior information; the combined effect of these aspects has not been explicitly tested. We used a captive feeding study of gray wolves (Canis lupus) to determine the first experimentally-derived trophic discrimination factors of C and N for this large carnivore of broad conservation interest. Using the estimated diet in our controlled system and data from a published study on wild wolves and their prey in Montana, USA, we then investigated the simultaneous effect of discrimination factors and prior information on diet reconstruction with Bayesian SIMMs. Discrimination factors for gray wolves and their prey were 1.97‰ for δ13C and 3.04‰ for δ15N. Specifying wolf discrimination factors, as opposed to the commonly used red fox (Vulpes vulpes) factors, made little practical difference to estimates of wolf diet, but prior information had a strong effect on bias, precision, and accuracy of posterior estimates. Without specifying prior information in our Bayesian SIMM, it was not possible to produce SIMM posteriors statistically similar to the estimated diet in our controlled study or the diet of wild wolves. Our study demonstrates the critical effect of prior information on estimates of animal diets using Bayesian SIMMs, and suggests species-specific trophic discrimination factors are of secondary importance. When using stable isotope analysis to inform conservation decisions researchers should understand the limits of their data. It may be difficult to obtain useful information from SIMMs if informative priors are omitted and species-specific discrimination factors are unavailable.  相似文献   

14.
Species distribution modeling (SDM) is an essential method in ecology and conservation. SDMs are often calibrated within one country's borders, typically along a limited environmental gradient with biased and incomplete data, making the quality of these models questionable. In this study, we evaluated how adequate are national presence‐only data for calibrating regional SDMs. We trained SDMs for Egyptian bat species at two different scales: only within Egypt and at a species‐specific global extent. We used two modeling algorithms: Maxent and elastic net, both under the point‐process modeling framework. For each modeling algorithm, we measured the congruence of the predictions of global and regional models for Egypt, assuming that the lower the congruence, the lower the appropriateness of the Egyptian dataset to describe the species' niche. We inspected the effect of incorporating predictions from global models as additional predictor (“prior”) to regional models, and quantified the improvement in terms of AUC and the congruence between regional models run with and without priors. Moreover, we analyzed predictive performance improvements after correction for sampling bias at both scales. On average, predictions from global and regional models in Egypt only weakly concur. Collectively, the use of priors did not lead to much improvement: similar AUC and high congruence between regional models calibrated with and without priors. Correction for sampling bias led to higher model performance, whatever prior used, making the use of priors less pronounced. Under biased and incomplete sampling, the use of global bats data did not improve regional model performance. Without enough bias‐free regional data, we cannot objectively identify the actual improvement of regional models after incorporating information from the global niche. 
However, we still believe that global model predictions have great potential to guide future surveys and improve regional sampling in data‐poor regions.

15.
Establishing that a set of population‐splitting events occurred at the same time can be a potentially persuasive argument that a common process affected the populations. Recently, Oaks et al. (2013) assessed the ability of an approximate‐Bayesian model‐choice method (msBayes) to estimate such a pattern of simultaneous divergence across taxa, to which Hickerson et al. (2014) responded. Both papers agree that the primary inference enabled by the method is very sensitive to prior assumptions and often erroneously supports shared divergences across taxa when prior uncertainty about divergence times is represented by a uniform distribution. However, the papers differ about the best explanation and solution for this problem. Oaks et al. (2013) suggested the method's behavior was caused by the strong weight of uniformly distributed priors on divergence times leading to smaller marginal likelihoods (and thus smaller posterior probabilities) of models with more divergence‐time parameters (Hypothesis 1); they proposed alternative prior probability distributions to avoid such strongly weighted posteriors. Hickerson et al. (2014) suggested numerical‐approximation error causes msBayes analyses to be biased toward models of clustered divergences because the method's rejection algorithm is unable to adequately sample the parameter space of richer models within reasonable computational limits when using broad uniform priors on divergence times (Hypothesis 2). As a potential solution, they proposed a model‐averaging approach that uses narrow, empirically informed uniform priors. Here, we use analyses of simulated and empirical data to demonstrate that the approach of Hickerson et al. (2014) does not mitigate the method's tendency to erroneously support models of highly clustered divergences, and is dangerous in the sense that the empirically derived uniform priors often exclude from consideration the true values of the divergence‐time parameters. Our results also show that the tendency of msBayes analyses to support models of shared divergences is primarily due to Hypothesis 1, whereas Hypothesis 2 is an untenable explanation for the bias. Overall, this series of papers demonstrates that if our prior assumptions place too much weight in unlikely regions of parameter space such that the exact posterior supports the wrong model of evolutionary history, no amount of computation can rescue our inference. Fortunately, as predicted by fundamental principles of Bayesian model choice, more flexible distributions that accommodate prior uncertainty about parameters without placing excessive weight in vast regions of parameter space with low likelihood increase the method's robustness and power to detect temporal variation in divergences.

16.
Bayesian inference allows the transparent communication and systematic updating of model uncertainty as new data become available. When applied to material flow analysis (MFA), however, Bayesian inference is undermined by the difficulty of defining proper priors for the MFA parameters and quantifying the noise in the collected data. We start to address these issues by first deriving and implementing an expert elicitation procedure suitable for generating MFA parameter priors. Second, we propose to learn the data noise concurrent with the parametric uncertainty. These methods are demonstrated using a case study on the 2012 US steel flow. Eight experts are interviewed to elicit distributions on steel flow uncertainty from raw materials to intermediate goods. The experts' distributions are combined and weighted according to the expertise demonstrated in response to seeding questions. These aggregated distributions form our model parameters' informative priors. Sensible, weakly informative priors are adopted for learning the data noise. Bayesian inference is then performed to update the parametric and data noise uncertainty given MFA data collected from the United States Geological Survey and the World Steel Association. The results show a reduction in MFA parametric uncertainty when incorporating the collected data. Only a modest reduction in data noise uncertainty was observed using 2012 data; however, greater reductions were achieved when using data from multiple years in the inference. These methods generate transparent MFA and data noise uncertainties learned from data rather than pre-assumed data noise levels, providing a more robust basis for decision-making that affects the system.  相似文献   

17.
ABSTRACT: BACKGROUND: An important question in the analysis of biochemical data is that of identifying subsets of molecular variables that may jointly influence a biological response. Statistical variable selection methods have been widely used for this purpose. In many settings, it may be important to incorporate ancillary biological information concerning the variables of interest. Pathway and network maps are one example of a source of such information. However, although ancillary information is increasingly available, it is not always clear how it should be used nor how it should be weighted in relation to primary data. RESULTS: We put forward an approach in which biological knowledge is incorporated using informative prior distributions over variable subsets, with prior information selected and weighted in an automated, objective manner using an empirical Bayes formulation. We employ continuous, linear models with interaction terms and exploit biochemically-motivated sparsity constraints to permit exact inference. We show an example of priors for pathway- and network-based information and illustrate our proposed method on both synthetic response data and by an application to cancer drug response data. Comparisons are also made to alternative Bayesian and frequentist penalised-likelihood methods for incorporating network-based information. CONCLUSIONS: The empirical Bayes method proposed here can aid prior elicitation for Bayesian variable selection studies and help to guard against mis-specification of priors. Empirical Bayes, together with the proposed pathway-based priors, results in an approach with a competitive variable selection performance. In addition, the overall procedure is fast, deterministic, and has very few user-set parameters, yet is capable of capturing interplay between molecular players. The approach presented is general and readily applicable in any setting with multiple sources of biological prior knowledge.  相似文献   

18.
Millar RB 《Biometrics》2004,60(2):536-542
Priors are seldom unequivocal and an important component of Bayesian modeling is assessment of the sensitivity of the posterior to the specified prior distribution. This is especially true in fisheries science where the Bayesian approach has been promoted as a rigorous method for including existing information from previous surveys and from related stocks or species. These informative priors may be highly contested by various interest groups. Here, formulae for the first and second derivatives of Bayes estimators with respect to hyper-parameters of the joint prior density are given. The formula for the second derivative provides a correction to a previously published result. The formulae are shown to reduce to very convenient and easily implemented forms when the hyper-parameters are for exponential family marginal priors. For model parameters with such priors it is shown that the ratio of posterior variance to prior variance can be interpreted as the sensitivity of the posterior mean to the prior mean. This methodology is applied to a nonlinear state-space model for the biomass of South Atlantic albacore tuna and sensitivity of the maximum sustainable yield to the prior specification is examined.
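The variance-ratio interpretation stated in this abstract can be checked by hand in the simplest conjugate case. The derivation below is a sketch under the assumption of a normal likelihood with known variance and a normal prior; the symbols $\mu_0$, $\tau_0^2$, $\mu_n$ and $\tau_n^2$ are introduced here for illustration and are not the paper's notation or its general result.

```latex
% Assumed model: y_1,\dots,y_n \mid \theta \sim N(\theta, \sigma^2), \quad \theta \sim N(\mu_0, \tau_0^2)
\begin{align}
  \tau_n^2 &= \left( \frac{1}{\tau_0^2} + \frac{n}{\sigma^2} \right)^{-1}
    && \text{(posterior variance)} \\
  \mu_n &= \tau_n^2 \left( \frac{\mu_0}{\tau_0^2} + \frac{n \bar{y}}{\sigma^2} \right)
    && \text{(posterior mean)} \\
  \frac{\partial \mu_n}{\partial \mu_0} &= \frac{\tau_n^2}{\tau_0^2}
    && \text{(sensitivity of posterior mean to prior mean)}
\end{align}
```

The sensitivity lies between 0 and 1: it approaches 1 when the data carry little information and 0 when the data dominate the prior.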

19.
A common concern in Bayesian data analysis is that an inappropriately informative prior may unduly influence posterior inferences. In the context of Bayesian clinical trial design, well chosen priors are important to ensure that posterior-based decision rules have good frequentist properties. However, it is difficult to quantify prior information in all but the most stylized models. This issue may be addressed by quantifying the prior information in terms of a number of hypothetical patients, i.e., a prior effective sample size (ESS). Prior ESS provides a useful tool for understanding the impact of prior assumptions. For example, the prior ESS may be used to guide calibration of prior variances and other hyperprior parameters. In this paper, we discuss such prior sensitivity analyses by using a recently proposed method to compute a prior ESS. We apply this in several typical settings of Bayesian biomedical data analysis and clinical trial design. The data analyses include cross-tabulated counts, multiple correlated diagnostic tests, and ordinal outcomes using a proportional-odds model. The study designs include a phase I trial with late-onset toxicities, a phase II trial that monitors event times, and a phase I/II trial with dose-finding based on efficacy and toxicity.  相似文献   
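As a rough illustration of reading a prior as a number of hypothetical patients, consider the textbook conjugate case of a Beta(a, b) prior on a response probability, where a + b is commonly taken as the effective sample size. This sketch is not the method the abstract refers to, which targets models where no such closed form exists; the function names and the numbers are assumptions made for illustration.

```python
def beta_prior_ess(a, b):
    """Effective sample size commonly attributed to a Beta(a, b) prior."""
    return a + b


def beta_posterior_mean(a, b, successes, trials):
    """Posterior mean of the response probability after binomial data."""
    return (a + successes) / (a + b + trials)


def prior_weight(a, b, trials):
    """Share of the posterior mean contributed by the prior mean."""
    return beta_prior_ess(a, b) / (beta_prior_ess(a, b) + trials)


# Hypothetical prior and trial data.
a, b = 3.0, 7.0              # prior mean 0.3, worth 10 hypothetical patients
successes, trials = 12, 20   # observed response rate 0.6

w = prior_weight(a, b, trials)
prior_mean = a / (a + b)
observed_rate = successes / trials

# The posterior mean is a weighted average of prior mean and observed rate.
blended = w * prior_mean + (1 - w) * observed_rate
post_mean = beta_posterior_mean(a, b, successes, trials)

print(f"prior ESS: {beta_prior_ess(a, b)}")
print(f"weight on prior: {w:.3f}")
print(f"posterior mean: {post_mean:.3f}")
```

Comparing the prior ESS with the planned trial size gives a quick sense of whether the prior could dominate early decisions.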

20.
In popular use of Bayesian phylogenetics, a default branch-length prior is almost universally applied without knowing how a different prior would have affected the outcome. We performed Bayesian and maximum likelihood (ML) inference of phylogeny based on empirical nucleotide sequence data from a family of lichenized ascomycetes, the Psoraceae, the morphological delimitation of which has been controversial. We specifically assessed the influence of the combination of Bayesian branch-length prior and likelihood model on the properties of the Markov chain Monte Carlo tree sample, including node support, branch lengths, and taxon stability. Data included two regions of the mitochondrial ribosomal RNA gene, the internal transcribed spacer region of the nuclear ribosomal RNA gene, and the protein-coding largest subunit of RNA polymerase II. Data partitioning was performed using Bayes' factors, whereas the best-fitting model of each partition was selected using the Bayesian information criterion (BIC). Given the data and model, short Bayesian branch-length priors generate higher numbers of strongly supported nodes as well as short and topologically similar trees sampled from parts of tree space that are largely unexplored by the ML bootstrap. Long branch-length priors generate fewer strongly supported nodes and longer and more dissimilar trees that are sampled mostly from inside the range of tree space sampled by the ML bootstrap. Priors near the ML distribution of branch lengths generate the best marginal likelihood and the highest frequency of "rogue" (unstable) taxa. The branch-length prior was shown to interact with the likelihood model. Trees inferred under complex partitioned models are more affected by the stretching effect of the branch-length prior. Fewer nodes are strongly supported under a complex model given the same branch-length prior. 
Irrespective of model, internal branches make up a larger proportion of total tree length under the shortest branch-length priors compared with longer priors. Relative effects on branch lengths caused by the branch-length prior can be problematic to downstream phylogenetic comparative methods making use of the branch lengths. Furthermore, given the same branch-length prior, trees are on average more dissimilar under a simple unpartitioned model compared with a more complex partitioned model. The distribution of ML branch lengths was shown to better fit a gamma or Pareto distribution than an exponential one. Model adequacy tests indicate that the best-fitting model selected by the BIC is insufficient for describing data patterns in 5 of 8 partitions. More general substitution models are required to explain the data in three of these partitions, one of which also requires nonstationarity. The two mitochondrial ribosomal RNA gene partitions need heterotachous models. We found no significant correlations between, on the one hand, the amount of ambiguous data or the smallest branch-length distance to another taxon and, on the other hand, the topological stability of individual taxa. Integrating over several exponentially distributed means under the best-fitting model, node support for the family Psoraceae, including Psora, Protoblastenia, and the Micarea sylvicola group, is approximately 0.96. Support for the genus Psora is distinctly lower, but we found no evidence to contradict the current classification.  相似文献   
