Found 20 similar articles (search time: 15 ms)
1.
2.
3.
Tanabe AS 《Molecular ecology resources》2011,11(5):914-921
Proportional and separate models, which can apply a different combination of substitution rate matrix (SRM) and among-site rate variation model (ASRVM) to each locus, are frequently used in phylogenetic studies of multilocus data. A proportional model assumes that branch lengths are proportional among partitions, whereas a separate model assumes that each partition has an independent set of branch lengths. However, the selection from among nonpartitioned (i.e., a common combination of models is applied to all-loci concatenated sequences), proportional and separate models is usually based on the researcher's preference rather than on any information criteria. This study describes two programs, 'Kakusan4' (for DNA sequences) and 'Aminosan' (for amino-acid sequences), which allow the selection of evolutionary models based on several types of information criteria. The programs can handle both multilocus and single-locus data, in addition to providing an easy-to-use wizard interface and a noninteractive command line interface. In the case of multilocus data, SRMs and ASRVMs are compared at each locus and at all-loci concatenated sequences, after which nonpartitioned, proportional and separate models are compared based on information criteria. The programs also provide model configuration files for MrBayes, PAUP*, PhyML, RAxML and Treefinder to support further phylogenetic analysis using a selected model. When likelihoods are optimized by Treefinder, the best-fit models were found to differ depending on the data set. Furthermore, differences in the information criteria among nonpartitioned, proportional and separate models were much larger than those among the nonpartitioned models. These findings suggest that selecting from nonpartitioned, proportional and separate models results in a better phylogenetic tree. Kakusan4 and Aminosan are available at http://www.fifthdimension.jp/. They are licensed under GNU GPL Ver. 2, and are able to run on Windows, MacOS X and Linux.
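As a rough illustration of the information-criterion comparison described in this abstract, the sketch below scores three partitioning schemes with AIC and BIC and ranks them. The log-likelihoods, parameter counts and number of sites are invented for illustration, and the function names are hypothetical; none of this is taken from Kakusan4 or Aminosan.

```python
import math

def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return k * math.log(n) - 2 * log_lik

def rank_models(candidates, criterion):
    """Return (name, score) pairs sorted from best (lowest score) to worst."""
    scores = {name: criterion(m) for name, m in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1])

# Hypothetical maximized log-likelihoods and free-parameter counts.
candidates = {
    "nonpartitioned": {"log_lik": -10500.0, "k": 10},
    "proportional": {"log_lik": -10320.0, "k": 32},
    "separate": {"log_lik": -10290.0, "k": 90},
}
n_sites = 3000  # assumed alignment length, used as the sample size for BIC

by_aic = rank_models(candidates, lambda m: aic(m["log_lik"], m["k"]))
by_bic = rank_models(candidates, lambda m: bic(m["log_lik"], m["k"], n_sites))
print("best by AIC:", by_aic[0][0])
print("best by BIC:", by_bic[0][0])
```

With these made-up numbers the separate model has the highest likelihood, but its extra parameters are penalized, so both criteria prefer the proportional model.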
4.
Tichit M Barbottin A Makowski D 《Animal : an international journal of animal bioscience》2010,4(6):819-826
In response to environmental threats, numerous indicators have been developed to assess the impact of livestock farming systems on the environment. Some of them, notably those based on management practices, have been reported to have low accuracy. This paper reports the results of a study aimed at assessing whether accuracy can be increased at a reasonable cost by mixing individual indicators into models. We focused on proxy indicators representing an alternative to the direct impact measurement on two grassland bird species, the lapwing Vanellus vanellus and the redshank Tringa totanus. Models were developed using stepwise selection procedures or Bayesian model averaging (BMA). Sensitivity, specificity, and probability of correctly ranking fields (area under the curve, AUC) were estimated for each individual indicator or model from observational data measured on 252 grazed plots during 2 years. The cost of implementation of each model was computed as a function of the number and types of input variables. Among all management indicators, 50% had an AUC lower than or equal to 0.50 and thus were not better than a random decision. Independently of the statistical procedure, models combining management indicators were always more accurate than individual indicators for lapwings only. In redshanks, models based either on BMA or some selection procedures were non-informative. Higher accuracy could be reached, for both species, with models mixing management and habitat indicators. However, this increase in accuracy was also associated with an increase in model cost. Models derived by BMA were more expensive and slightly less accurate than those derived with selection procedures. Analysing trade-offs between accuracy and cost of indicators opens promising application perspectives as time consuming and expensive indicators are likely to be of low practical utility.
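The accuracy measures named in this abstract can be sketched as follows. AUC is computed here as the probability that a randomly chosen occupied field receives a higher indicator score than a randomly chosen unoccupied one, with ties counted as one half. The scores and the threshold are invented for illustration, and the function names are hypothetical.

```python
def auc(scores_pos, scores_neg):
    """Probability that a positive case outranks a negative case (ties count 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

def sensitivity_specificity(scores_pos, scores_neg, threshold):
    """Classify a field as occupied when its score is >= threshold."""
    true_pos = sum(s >= threshold for s in scores_pos)
    true_neg = sum(s < threshold for s in scores_neg)
    return true_pos / len(scores_pos), true_neg / len(scores_neg)

# Hypothetical indicator scores for fields with and without the species.
occupied = [0.9, 0.4]
unoccupied = [0.1, 0.6]

print(auc(occupied, unoccupied))
print(sensitivity_specificity(occupied, unoccupied, 0.5))
```

An AUC of 0.5 corresponds to the "not better than a random decision" case mentioned in the abstract.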
5.
A popular approach to detecting positive selection is to estimate the parameters of a probabilistic model of codon evolution and perform inference based on its maximum likelihood parameter values. This approach has been evaluated intensively in a number of simulation studies and found to be robust when the available data set is large. However, uncertainties in the estimated parameter values can lead to errors in the inference, especially when the data set is small or there is insufficient divergence between the sequences. We introduce a Bayesian model comparison approach to infer whether the sequence as a whole contains sites at which the rate of nonsynonymous substitution is greater than the rate of synonymous substitution. We incorporated this probabilistic model comparison into a Bayesian approach to site-specific inference of positive selection. Using simulated sequences, we compared this approach to the commonly used empirical Bayes approach and investigated the effect of tree length on the performance of both methods. We found that the Bayesian approach outperforms the empirical Bayes method when the amount of sequence divergence is small and is less prone to false-positive inference when the sequences are saturated, while the results are indistinguishable for intermediate levels of sequence divergence.
6.
Inferring the demographic history of species and their populations is crucial to understand their contemporary distribution, abundance and adaptations. The high computational overhead of likelihood‐based inference approaches severely restricts their applicability to large data sets or complex models. In response to these restrictions, approximate Bayesian computation (ABC) methods have been developed to infer the demographic past of populations and species. Here, we present the results of an evaluation of the ABC‐based approach implemented in the popular software package DIYABC using simulated data sets (mitochondrial DNA sequences, microsatellite genotypes and single nucleotide polymorphisms). We simulated population genetic data under five different simple, single‐population models to assess the model recovery rates as well as the bias and error of the parameter estimates. The ability of DIYABC to recover the correct model was relatively low (0.49): 0.6 for the simplest models and 0.3 for the more complex models. The recovery rate improved significantly when reducing the number of candidate models from five to three (from 0.57 to 0.71). Among the parameters of interest, the effective population size was estimated at a higher accuracy compared to the timing of events. Increased amounts of genetic data did not significantly improve the accuracy of the parameter estimates. Some gains in accuracy and decreases in error were observed for scaled parameters (e.g., Neμ) compared to unscaled parameters (e.g., Ne and μ). We concluded that DIYABC‐based assessments are not suited to capture a detailed demographic history, but might be efficient at capturing simple, major demographic changes.
7.
Ralph Mac Nally 《Biodiversity and Conservation》2000,9(5):655-671
In many large-scale conservation or ecological problems where experiments are intractable or unethical, regression methods are used to attempt to gauge the impact of a set of nominally independent variables (X) upon a dependent variable (Y). Workers often want to assert that a given X has a major influence on Y, and so use this indirection to infer a probable causal relationship. There are two difficulties apart from the demonstrability issue itself: (1) multiple regression is plagued by collinear relationships in X; and (2) any regression is designed to produce a function that in some way minimizes the overall difference between the observed and predicted Ys, which does not necessarily equate to determining probable influence in a multivariate setting. Problem (1) may be explored by comparing two avenues, one in which a single best regression model is sought and the other where all possible regression models are considered contemporaneously. It is suggested that if the two approaches do not agree upon which of the independent variables are likely to be significant, then the deductions must be subject to doubt.
8.
In many cell types, the inositol trisphosphate receptor (IPR) is one of the important components that control intracellular calcium dynamics, and an understanding of this receptor (which is also a calcium channel) is necessary for an understanding of calcium oscillations and waves. Recent advances in experimental techniques now allow for the measurement of single-channel activity of the IPR in conditions similar to its native environment, and these data can be used to determine the rate constants in Markov models of the IPR. We illustrate a parameter estimation method based on Markov chain Monte Carlo, which can be used to fit directly to single-channel data and which determines, as an intrinsic part of the fit, the times at which the IPR is opening and closing. We show, using simulated data, the most complex Markov model that can be unambiguously determined from steady-state data and show that non-steady-state data is required to determine more complex models.
9.
AKIFUMI S. TANABE 《Molecular ecology resources》2007,7(6):962-964
The application of different substitution models to each gene (a.k.a. mixed model) should be considered in model‐based phylogenetic analysis of multigene sequences. However, a single molecular evolution model is still usually applied. There are no computer programs able to conduct model selection for multiple loci at the same time, though several recently developed types of software for phylogenetic inference can handle mixed models. Here, I have developed computer software named ‘kakusan’ that enables us to solve the above problems. Major running steps are briefly described, and an analysis of results with kakusan is compared to that obtained with another program.
10.
11.
Kevin Gross William F. Morris Michael S. Wolosin Daniel F. Doak 《Population Ecology》2006,48(1):79-89
Population projection matrices are commonly used by ecologists and managers to analyze the dynamics of stage-structured populations. Building projection matrices from data requires estimating transition rates among stages, a task that often entails estimating many parameters with few data. Consequently, large sampling variability in the estimated transition rates increases the uncertainty in the estimated matrix and quantities derived from it, such as the population multiplication rate and sensitivities of matrix elements. Here, we propose a strategy to avoid overparameterized matrix models. This strategy involves fitting models to the vital rates that determine matrix elements, evaluating both these models and ones that estimate matrix elements individually with model selection via information criteria, and averaging competing models with multimodel averaging. We illustrate this idea with data from a population of Silene acaulis (Caryophyllaceae), and conduct a simulation to investigate the statistical properties of the matrices estimated in this way. The simulation shows that compared with estimating matrix elements individually, building population projection matrices by fitting and averaging models of vital-rate estimates can reduce the statistical error in the population projection matrix and quantities derived from it.
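One common way to carry out the multimodel averaging mentioned in this abstract is to convert information-criterion scores into Akaike weights and take a weighted mean of the per-model estimates. The sketch below assumes that approach; the abstract does not state the exact weighting the authors used. The AIC values and vital-rate estimates are invented, and the function names are hypothetical.

```python
import math

def akaike_weights(ic_values):
    """Convert information-criterion scores into normalized model weights."""
    best = min(ic_values)
    relative = [math.exp(-0.5 * (value - best)) for value in ic_values]
    total = sum(relative)
    return [r / total for r in relative]

def model_average(estimates, weights):
    """Weighted mean of the per-model estimates."""
    return sum(e * w for e, w in zip(estimates, weights))

# Hypothetical AIC scores and survival-rate estimates from three candidate models.
aic_values = [100.0, 102.0, 110.0]
survival_estimates = [0.80, 0.70, 0.50]

weights = akaike_weights(aic_values)
averaged = model_average(survival_estimates, weights)
print([round(w, 3) for w in weights], round(averaged, 3))
```

The averaged vital rate could then be placed in the projection matrix in place of a single-model estimate.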
12.
In this article, we develop a latent class model with class probabilities that depend on subject-specific covariates. One of our major goals is to identify important predictors of latent classes. We consider methodology that allows estimation of latent classes while allowing for variable selection uncertainty. We propose a Bayesian variable selection approach and implement a stochastic search Gibbs sampler for posterior computation to obtain model-averaged estimates of quantities of interest such as marginal inclusion probabilities of predictors. Our methods are illustrated through simulation studies and application to data on weight gain during pregnancy, where it is of interest to identify important predictors of latent weight gain classes.
13.
J. W. Durban D. A. Elston D. K. Ellifrit E. Dickson P. S. Hammond P. M. Thompson 《Marine Mammal Science》2005,21(1):80-92
Mark-recapture techniques are widely used to estimate the size of wildlife populations. However, in cetacean photo-identification studies, it is often impractical to sample across the entire range of the population. Consequently, negatively biased population estimates can result when large portions of a population are unavailable for photographic capture. To overcome this problem, we propose that individuals be sampled from a number of discrete sites located throughout the population's range. The recapture of individuals between sites can then be presented in a simple contingency table, where the cells refer to discrete categories formed by combinations of the study sites. We present a Bayesian framework for fitting a suite of log-linear models to these data, with each model representing a different hypothesis about dependence between sites. Modeling dependence facilitates the analysis of opportunistic photo-identification data from study sites located due to convenience rather than by design. Because inference about population size is sensitive to model choice, we use Bayesian Markov chain Monte Carlo approaches to estimate posterior model probabilities, and base inference on a model-averaged estimate of population size. We demonstrate this method in the analysis of photographic mark-recapture data for bottlenose dolphins from three coastal sites around NE Scotland.
14.
15.
Across multiply imputed data sets, variable selection methods such as stepwise regression and other criterion-based strategies that include or exclude particular variables typically result in models with different selected predictors, thus presenting a problem for combining the results from separate complete-data analyses. Here, drawing on a Bayesian framework, we propose two alternative strategies to address the problem of choosing among linear regression models when there are missing covariates. One approach, which we call "impute, then select" (ITS) involves initially performing multiple imputation and then applying Bayesian variable selection to the multiply imputed data sets. A second strategy is to conduct Bayesian variable selection and missing data imputation simultaneously within one Gibbs sampling process, which we call "simultaneously impute and select" (SIAS). The methods are implemented and evaluated using the Bayesian procedure known as stochastic search variable selection for multivariate normal data sets, but both strategies offer general frameworks within which different Bayesian variable selection algorithms could be used for other types of data sets. A study of mental health services utilization among children in foster care programs is used to illustrate the techniques. Simulation studies show that both ITS and SIAS outperform complete-case analysis with stepwise variable selection and that SIAS slightly outperforms ITS.
16.
A groundwater field is a complex and open system. Groundwater simulations and predictions often deviate from true values, which is attributed to the uncertainty of groundwater modeling. The conceptual model (model structure) is one of the main sources of groundwater modeling uncertainty. In this study, the mean Euclidean distance (MED) between model simulations and observations is proposed to assess the integrated likelihood value of a conceptual model in Bayesian model averaging (BMA). Moreover, this proposed BMA method is compared with the traditional generalized likelihood uncertainty estimation (GLUE) BMA method using a synthetic groundwater model, and the characteristics of these two BMA methods are summarized.
17.
For over a decade, experimental evolution has been combined with high-throughput sequencing techniques. In so-called Evolve-and-Resequence (E&R) experiments, populations are kept in the laboratory under controlled experimental conditions where their genomes are sampled and allele frequencies monitored. However, identifying signatures of adaptation in E&R datasets is far from trivial, and it is still necessary to develop more efficient and statistically sound methods for detecting selection in genome-wide data. Here, we present Bait-ER, a fully Bayesian approach based on the Moran model of allele evolution to estimate selection coefficients from E&R experiments. The model has overlapping generations, a feature that describes several experimental designs found in the literature. We tested our method under several different demographic and experimental conditions to assess its accuracy and precision, and it performs well in most scenarios. Nevertheless, some care must be taken when analysing trajectories where drift largely dominates and starting frequencies are low. We compare our method with other available software and report that ours has generally high accuracy even for trajectories whose complexity goes beyond a classical sweep model. Furthermore, our approach avoids the computational burden of simulating an empirical null distribution, outperforming available software in terms of computational time and facilitating its use on genome-wide data. We implemented and released our method in a new open-source software package that can be accessed at https://doi.org/10.5281/zenodo.7351736.
18.
We present the first scientific study of white-shouldered ibis Pseudibis davisoni habitat preferences in dry dipterocarp forest. Foraging sites included seasonal pools, forest understorey grasslands and fallow rice fields, with terrestrial sites used more following rainfall. Habitat and anthropogenic effects in logistic models of foraging site selection were examined by multimodel inference and model averaging. White-shouldered ibis preferred pools with greater cover of short vegetation (<25 cm) and less of the boundary enclosed, and forest sites with greater cover of bare substrate and lower people encounter rate. At forest sites, livestock density was positively related to bare substrate extent and thus may improve suitability for foraging ibis. At pools, livestock removed tall vegetation between the early and late dry season, indicating their importance in opening up foraging habitats after wet season growth. However, by the late dry season, pools with greater livestock density had less short vegetation, the habitat favoured by ibis. Conservation strategies for white-shouldered ibis must consider a range of habitats, not just seasonal wetlands, and should incorporate extensive grazing and associated burning practices of local communities. Further understanding of the effects of these practices on vegetation, prey abundance and prey availability is therefore needed for effective conservation of this species. This will also develop our understanding of potentially beneficial anthropogenic influences in tropical environments.
19.
The Joint United Nations Programme on HIV/AIDS (UNAIDS) has decided to use Bayesian melding as the basis for its probabilistic projections of HIV prevalence in countries with generalized epidemics. This combines a mechanistic epidemiological model, prevalence data, and expert opinion. Initially, the posterior distribution was approximated by sampling‐importance‐resampling, which is simple to implement, easy to interpret, transparent to users, and gave acceptable results for most countries. For some countries, however, this is not computationally efficient because the posterior distribution tends to be concentrated around nonlinear ridges and can also be multimodal. We propose instead incremental mixture importance sampling (IMIS), which iteratively builds up a better importance sampling function. This retains the simplicity and transparency of sampling‐importance‐resampling, but is much more efficient computationally. It also leads to a simple estimator of the integrated likelihood that is the basis for Bayesian model comparison and model averaging. In simulation experiments and on real data, it outperformed both sampling‐importance‐resampling and three publicly available generic Markov chain Monte Carlo algorithms for this kind of problem.
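To make the baseline method concrete, here is a minimal sketch of plain sampling-importance-resampling (not IMIS itself) on a toy problem: a standard normal prior on a parameter and a single observation with unit-variance normal noise. The toy model, sample sizes and function names are assumptions chosen so that the answer is known analytically. The mean of the importance weights estimates the integrated likelihood, which is the quantity the abstract refers to for model comparison.

```python
import math
import random

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def sir(y, n_draws=20000, n_resample=5000, seed=0):
    """Sampling-importance-resampling with the prior as the importance function."""
    rng = random.Random(seed)
    # Step 1: sample parameter values from the prior, N(0, 1).
    draws = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]
    # Step 2: weight each draw by the likelihood of the observation y.
    weights = [normal_pdf(y, theta, 1.0) for theta in draws]
    # The average weight is a Monte Carlo estimate of the integrated likelihood.
    integrated_likelihood = sum(weights) / n_draws
    # Step 3: resample in proportion to the weights to approximate the posterior.
    posterior_sample = rng.choices(draws, weights=weights, k=n_resample)
    return integrated_likelihood, posterior_sample

evidence, sample = sir(y=1.0)
posterior_mean = sum(sample) / len(sample)
print(evidence, posterior_mean)
```

For this toy model the exact integrated likelihood is the N(0, 2) density evaluated at y, and the exact posterior mean is y/2, so the estimates can be checked directly. When the posterior is concentrated in a small region of the prior, most weights are near zero, which is the inefficiency the abstract says IMIS addresses.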
20.
Often there is substantial uncertainty in the selection of confounders when estimating the association between an exposure and health. We define this type of uncertainty as 'adjustment uncertainty'. We propose a general statistical framework for handling adjustment uncertainty in exposure effect estimation for a large number of confounders, we describe a specific implementation, and we develop associated visualization tools. Theoretical results and simulation studies show that the proposed method provides consistent estimators of the exposure effect and its variance. We also show that, when the goal is to estimate an exposure effect accounting for adjustment uncertainty, Bayesian model averaging with posterior model probabilities approximated using information criteria can fail to estimate the exposure effect and can over- or underestimate its variance. We compare our approach to Bayesian model averaging using time series data on levels of fine particulate matter and mortality.