Found 20 similar articles (search time: 15 ms)
1.
2.
Huelsenbeck JP Joyce P Lakner C Ronquist F 《Philosophical transactions of the Royal Society of London. Series B, Biological sciences》2008,363(1512):3941-3953
Models of amino acid substitution present challenges beyond those often faced with the analysis of DNA sequences. The alignments of amino acid sequences are often small, whereas the number of parameters to be estimated is potentially large when compared with the number of free parameters for nucleotide substitution models. Most approaches to the analysis of amino acid alignments have focused on the use of fixed amino acid models in which all of the potentially free parameters are fixed to values estimated from a large number of sequences. Often, these fixed amino acid models are specific to a gene or taxonomic group (e.g. the Mtmam model, which has parameters that are specific to mammalian mitochondrial gene sequences). Although the fixed amino acid models succeed in reducing the number of free parameters to be estimated (indeed, they reduce the number of free parameters from approximately 200 to 0), it is possible that none of the currently available fixed amino acid models is appropriate for a specific alignment. Here, we present four approaches to the analysis of amino acid sequences. First, we explore the use of a general time reversible model of amino acid substitution using a Dirichlet prior probability distribution on the 190 exchangeability parameters. Second, we then explore the behaviour of prior probability distributions that are 'centred' on the rates specified by the fixed amino acid model. Third, we consider a mixture of fixed amino acid models. Finally, we consider constraints on the exchangeability parameters as partitions, similar to how nucleotide substitution models are specified, and place a Dirichlet process prior model on all the possible partitioning schemes.
3.
4.
Methodology for joint mapping of quantitative trait loci (QTL) affecting continuous and binary characters in experimental
crosses is presented. The procedure consists of a Bayesian Gaussian-threshold model implemented via Markov chain Monte Carlo,
which bypasses bottlenecks due to high-dimensional integrals required in maximum likelihood approaches. The method handles
multiple binary traits and multiple QTL. Modeling of ordered categorical traits is discussed as well. Features of the method
are illustrated using simulated datasets representing a backcross design, and the data are analyzed using mixed-trait and
single-trait models. The mixed-trait analysis provides greater detection power of a QTL than a single-trait analysis when
the QTL affects two or more traits. The number of QTL inferred in the mixed-trait analysis does not pertain to a specific
trait, but the roles of each QTL on specific traits can be assessed from estimates of its effects. The impacts of varying
incidence level and sample size on the mixed-trait QTL mapping analysis are investigated as well.
5.
A common problem in molecular phylogenetics is choosing a model of DNA substitution that does a good job of explaining the DNA sequence alignment without introducing superfluous parameters. A number of methods have been used to choose among a small set of candidate substitution models, such as the likelihood ratio test, the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and Bayes factors. Current implementations of any of these criteria suffer from the limitation that only a small set of models are examined, or that the test does not allow easy comparison of non-nested models. In this article, we expand the pool of candidate substitution models to include all possible time-reversible models. This set includes seven models that have already been described. We show how Bayes factors can be calculated for these models using reversible jump Markov chain Monte Carlo, and apply the method to 16 DNA sequence alignments. For each data set, we compare the model with the best Bayes factor to the best models chosen using AIC and BIC. We find that the best model under any of these criteria is not necessarily the most complicated one; models with an intermediate number of substitution types typically do best. Moreover, almost all of the models that are chosen as best do not constrain a transition rate to be the same as a transversion rate, suggesting that it is the transition/transversion rate bias that plays the largest role in determining which models are selected. Importantly, the reversible jump Markov chain Monte Carlo algorithm described here allows estimation of phylogeny (and other phylogenetic model parameters) to be performed while accounting for uncertainty in the model of DNA substitution.
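As a rough illustration of two of the criteria named in this abstract, the sketch below computes AIC and BIC from a maximized log-likelihood using their standard definitions. The function names, the model labels, and all numeric values are hypothetical and are not taken from the article; treating the sample size `n` as the number of alignment sites is also an assumption. The Bayes factor and reversible jump calculations described in the abstract are not shown.

```python
import math


def aic(log_lik, k):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian Information Criterion: k ln n - 2 ln L (lower is better)."""
    return k * math.log(n) - 2 * log_lik


# Hypothetical candidates: (maximized log-likelihood, number of free parameters).
candidates = {
    "model_A": (-5120.4, 1),
    "model_B": (-5098.7, 5),
    "model_C": (-5096.9, 9),
}
n_sites = 1200  # assumed sample size (number of alignment columns)

best_by_aic = min(candidates, key=lambda m: aic(*candidates[m]))
best_by_bic = min(candidates, key=lambda m: bic(*candidates[m], n_sites))
```

Because BIC penalizes each extra parameter by ln n rather than 2, it tends to favor simpler models than AIC once n exceeds about 7 observations.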
6.
This paper presents a Bayesian analysis of a time series of counts to assess its dependence on an explanatory variable. The time series represented is the incidence of the infectious disease ESBL-producing Klebsiella pneumoniae in an Australian hospital and the explanatory variable is the number of grams of antibiotic (third generation) cephalosporin used during that time. We demonstrate that there is a statistically significant relationship between disease occurrence and use of the antibiotic, lagged by three months. The model used is a parameter-driven model in the form of a generalized linear mixed model. Comparison of models is made in terms of mean square error.
7.
Monte Carlo methods have received much attention in the recent literature of phylogeny analysis. However, the conventional Markov chain Monte Carlo algorithms, such as the Metropolis–Hastings algorithm, tend to get trapped in a local mode in simulating from the posterior distribution of phylogenetic trees, rendering the inference ineffective. In this paper, we apply an advanced Monte Carlo algorithm, the stochastic approximation Monte Carlo (SAMC) algorithm, to Bayesian phylogeny analysis. Our method is compared with two popular Bayesian phylogeny software packages, BAMBE and MrBayes, on simulated and real datasets. The numerical results indicate that our method outperforms BAMBE and MrBayes. Among the three methods, SAMC produces the consensus trees which have the highest similarity to the true trees, and the model parameter estimates which have the smallest mean square errors, while costing the least CPU time.
8.
Adaptive sampling for Bayesian variable selection (total citations: 1; self-citations: 0; citations by others: 1)
9.
Bayesian Nonparametric Nonproportional Hazards Survival Modeling (total citations: 1; self-citations: 0; citations by others: 1)
Summary. We develop a dependent Dirichlet process model for survival analysis data. A major feature of the proposed approach is that there is no necessity for resulting survival curve estimates to satisfy the ubiquitous proportional hazards assumption. An illustration based on a cancer clinical trial is given, where survival probabilities for times early in the study are estimated to be lower for those on a high-dose treatment regimen than for those on the low-dose treatment, while the reverse is true for later times, possibly due to the toxic effect of the high dose for those who are not as healthy at the beginning of the study.
10.
A Bayesian approach to analysing data from family-based association studies is developed. This permits direct assessment of the range of possible values of model parameters, such as the recombination frequency and allelic associations, in the light of the data. In addition, sophisticated comparisons of different models may be handled easily, even when such models are not nested. The methodology is developed in such a way as to allow separate inferences to be made about linkage and association by including theta, the recombination fraction between the marker and disease susceptibility locus under study, explicitly in the model. The method is illustrated by application to a previously published data set. The data analysis raises some interesting issues, notably with regard to the weight of evidence necessary to convince us of linkage between a candidate locus and disease.
11.
Bayesian predictive information criterion for the evaluation of hierarchical Bayesian and empirical Bayes models (total citations: 3; self-citations: 0; citations by others: 3)
The problem of evaluating the goodness of the predictive distributions of hierarchical Bayesian and empirical Bayes models is investigated. A Bayesian predictive information criterion is proposed as an estimator of the posterior mean of the expected loglikelihood of the predictive distribution when the specified family of probability distributions does not contain the true distribution. The proposed criterion is developed by correcting the asymptotic bias of the posterior mean of the loglikelihood as an estimator of its expected loglikelihood. In the evaluation of hierarchical Bayesian models with random effects, regardless of our parametric focus, the proposed criterion considers the bias correction of the posterior mean of the marginal loglikelihood because it requires a consistent parameter estimator. The use of the bootstrap in model evaluation is also discussed.
12.
In protein-coding DNA sequences, historical patterns of selection can be inferred from amino acid substitution patterns. High relative rates of nonsynonymous to synonymous changes (=dN/dS) are a clear indicator of positive, or directional, selection, and several recently developed methods attempt to distinguish these sites from those under neutral or purifying selection. One method uses an empirical Bayesian framework that accounts for varying selective pressures across sites while conditioning on the parameters of the model of DNA evolution and on the phylogenetic history. We describe a method that identifies sites under diversifying selection using a fully Bayesian framework. Similar to earlier work, the method presented here allows the rate of nonsynonymous to synonymous changes to vary among sites. The significant difference in using a fully Bayesian approach lies in our ability to account for uncertainty in parameters including the tree topology, branch lengths, and the codon model of DNA substitution. We demonstrate the utility of the fully Bayesian approach by applying our method to a data set of the vertebrate -globin gene. Compared to a previous analysis of this data set, the hierarchical model found most of the same sites to be in the positive selection class, but with a few striking exceptions.
13.
Comparison of Bayesian and maximum likelihood bootstrap measures of phylogenetic reliability (total citations: 16; self-citations: 0; citations by others: 16)
Douady CJ Delsuc F Boucher Y Doolittle WF Douzery EJ 《Molecular biology and evolution》2003,20(2):248-254
Owing to the exponential growth of genome databases, phylogenetic trees are now widely used to test a variety of evolutionary hypotheses. Nevertheless, computation time burden limits the application of methods such as maximum likelihood nonparametric bootstrap to assess reliability of evolutionary trees. As an alternative, the much faster Bayesian inference of phylogeny, which expresses branch support as posterior probabilities, has been introduced. However, marked discrepancies exist between nonparametric bootstrap proportions and Bayesian posterior probabilities, leading to difficulties in the interpretation of sometimes strongly conflicting results. As an attempt to reconcile these two indices of node reliability, we apply the nonparametric bootstrap resampling procedure to the Bayesian approach. The correlation between posterior probabilities, bootstrap maximum likelihood percentages, and bootstrapped posterior probabilities was studied for eight highly diverse empirical data sets and was also investigated using experimental simulation. Our results show that the relation between posterior probabilities and bootstrapped maximum likelihood percentages is highly variable but that very strong correlations always exist when Bayesian node support is estimated on bootstrapped character matrices. Moreover, simulations corroborate empirical observations in suggesting that, being more conservative, the bootstrap approach might be less prone to strongly supporting a false phylogenetic hypothesis. Thus, apparent conflicts in topology recovered by the Bayesian approach were reduced after bootstrapping. Both posterior probabilities and bootstrap supports are of great interest to phylogeny as potential upper and lower bounds of node reliability, but they are surely not interchangeable and cannot be directly compared.
14.
On the Bayesian analysis of ring-recovery data (total citations: 5; self-citations: 0; citations by others: 5)
Vounatsou and Smith (1995, Biometrics 51, 687-708) describe the modern Bayesian analysis of ring-recovery data. Here we discuss and extend their work. We draw different conclusions from two major data analyses. We emphasize the extreme sensitivity of certain parameter estimates to the choice of prior distribution and conclude that naive use of Bayesian methods in this area can be misleading. Additionally, we explain the discrepancy between the Bayesian and classical analyses when the likelihood surface has a flat ridge. In this case, when there is no unique maximum likelihood estimate, the Bayesian estimators are remarkably precise.
15.
Logistic disease incidence models and case-control studies (total citations: 8; self-citations: 0; citations by others: 8)
16.
Johnson TD 《Biometrics》2003,59(3):650-660
Many hormones are secreted into the circulatory system in a pulsatile manner and are cleared exponentially. The most common method of analyzing these systems is to deconvolve the hormone concentration into a secretion function and a clearance function. Accurate estimation of the model parameters depends on the number and location of the secretion pulses. To date, deconvolution analysis assumes the number and approximate location of these pulses are known a priori. In this article, we present a novel Bayesian approach to deconvolution that jointly models the number of pulses along with all other model parameters. Our method stochastically searches for the secretion pulses. This is accomplished by viewing the set of parameters that define the pulses as a point process. Pulses are determined by a birth-death process which is embedded in a Markov chain Monte Carlo algorithm. This idea originated with Stephens (2000, Annals of Statistics 28, 40-74) in the context of finite mixture model density estimation, where the number of mixture components is unknown. There are several advantages that our model enjoys over the traditional frequentist approaches. These advantages are highlighted with four datasets consisting of serum concentration levels of luteinizing hormone obtained from ovariectomized ewes.
17.
Model-based estimation of the human health risks resulting from exposure to environmental contaminants can be an important tool for structuring public health policy. Due to uncertainties in the modeling process, the outcomes of these assessments are usually probabilistic representations of a range of possible risks. In some cases, health surveillance data are available for the assessment population over all or a subset of the risk projection period and this additional information can be used to augment the model-based estimates. We use a Bayesian approach to update model-based estimates of health risks based on available health outcome data. Updated uncertainty distributions for risk estimates are derived using Monte Carlo sampling, which allows flexibility to model realistic situations including measurement error in the observable outcomes. We illustrate the approach by using imperfect public health surveillance data on lung cancer deaths to update model-based lung cancer mortality risk estimates in a population exposed to ionizing radiation from a uranium processing facility.
18.
19.
Kozumi H 《Biometrics》2000,56(4):1002-1006
This paper considers discrete survival data from a Bayesian point of view. A sequence of the baseline hazard functions, which plays an important role in the discrete hazard function, is modeled with a hidden Markov chain. It is explained how the resultant model is implemented via Markov chain Monte Carlo methods. The model is illustrated by an application to real data.
20.
Cohen's kappa coefficient is a widely popular measure for chance-corrected nominal scale agreement between two raters. This article describes Bayesian analysis for kappa that can be routinely implemented using Markov chain Monte Carlo (MCMC) methodology. We consider the case of m ≥ 2 independent samples of measured agreement, where in each sample a given subject is rated by two rating protocols on a binary scale. A major focus here is on testing the homogeneity of the kappa coefficient across the different samples. The existing frequentist tests for this case assume exchangeability of rating protocols, whereas our proposed Bayesian test does not make any such assumption. Extensive simulation is carried out to compare the performances of the Bayesian and the frequentist tests. The developed methodology is illustrated using data from a clinical trial in ophthalmology.
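For readers unfamiliar with the statistic, the following minimal sketch computes the ordinary point estimate of Cohen's kappa, kappa = (p_o - p_e) / (1 - p_e), from a square agreement table. It is not the Bayesian MCMC analysis or the homogeneity test described in the abstract; the function name and the example counts are hypothetical.

```python
def cohens_kappa(table):
    """Point estimate of Cohen's kappa.

    table[i][j] is the number of subjects placed in category i by the
    first rating protocol and in category j by the second.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: proportion of subjects on the diagonal.
    p_o = sum(table[i][i] for i in range(k)) / n
    # Marginal proportions for each protocol.
    row_marg = [sum(table[i]) / n for i in range(k)]
    col_marg = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # Agreement expected by chance if the two protocols were independent.
    p_e = sum(row_marg[i] * col_marg[i] for i in range(k))
    return (p_o - p_e) / (1 - p_e)


# Hypothetical binary-scale sample: 50 subjects rated by two protocols.
example = [[20, 5],
           [10, 15]]
kappa = cohens_kappa(example)
```

Here the observed agreement is 0.7 and the chance agreement is 0.5, so kappa is about 0.4. The estimate is undefined when chance agreement equals 1, a case this sketch does not handle.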