
1.

Bayesian Estimation of Positively Selected Sites

Huelsenbeck JP Dyer KA 《Journal of molecular evolution》2004,58(6):661-672

In protein-coding DNA sequences, historical patterns of selection can be inferred from amino acid substitution patterns. High relative rates of nonsynonymous to synonymous changes (d_N/d_S) are a clear indicator of positive, or directional, selection, and several recently developed methods attempt to distinguish these sites from those under neutral or purifying selection. One method uses an empirical Bayesian framework that accounts for varying selective pressures across sites while conditioning on the parameters of the model of DNA evolution and on the phylogenetic history. We describe a method that identifies sites under diversifying selection using a fully Bayesian framework. Similar to earlier work, the method presented here allows the rate of nonsynonymous to synonymous changes to vary among sites. The significant difference in using a fully Bayesian approach lies in our ability to account for uncertainty in parameters including the tree topology, branch lengths, and the codon model of DNA substitution. We demonstrate the utility of the fully Bayesian approach by applying our method to a data set of the vertebrate β-globin gene. Compared to a previous analysis of this data set, the hierarchical model found most of the same sites to be in the positive selection class, but with a few striking exceptions.

2.

Total citations: 3 (self-citations: 0; citations by others: 3)

Ando Tomohiro 《Biometrika》2007,94(2):443-458

The problem of evaluating the goodness of the predictive distributions of hierarchical Bayesian and empirical Bayes models is investigated. A Bayesian predictive information criterion is proposed as an estimator of the posterior mean of the expected loglikelihood of the predictive distribution when the specified family of probability distributions does not contain the true distribution. The proposed criterion is developed by correcting the asymptotic bias of the posterior mean of the loglikelihood as an estimator of its expected loglikelihood. In the evaluation of hierarchical Bayesian models with random effects, regardless of our parametric focus, the proposed criterion considers the bias correction of the posterior mean of the marginal loglikelihood because it requires a consistent parameter estimator. The use of the bootstrap in model evaluation is also discussed.

3.

QTL mapping in outbred half-sib families using Bayesian model selection

Fang M Liu J Sun D Zhang Y Zhang Q Zhang Y Zhang S 《Heredity》2011,107(3):265-276

In this article, we propose a model selection method, the Bayesian composite model space approach, to map quantitative trait loci (QTL) in a half-sib population for continuous and binary traits. In our method, the identity-by-descent-based variance component model is used. To demonstrate the performance of this model, the method was applied to map QTL underlying production traits on BTA6 in a Chinese half-sib dairy cattle population. A total of four QTLs were detected, whereas only one QTL was identified using the traditional least squares (LS) method. We also conducted two simulation experiments to validate the efficiency of our method. The results suggest that the proposed method based on a multiple-QTL model is efficient in mapping multiple QTL for an outbred half-sib population and is more powerful than the LS method based on a single-QTL model.

4.

Marko J. Rinta‐aho Mikko J. Sillanp 《Biometrical journal. Biometrische Zeitschrift》2019,61(3):729-746

Stochastic search variable selection (SSVS) is a Bayesian variable selection method that employs covariate‐specific discrete indicator variables to select which covariates (e.g., molecular markers) are included in or excluded from the model. We present a new variant of SSVS where, instead of discrete indicator variables, we use continuous‐scale weighting variables (which can also take values between zero and one) to select covariates into the model. The improved model performance is shown and compared to standard SSVS using simulated and real quantitative trait locus mapping datasets. Decisions about phenotype‐genotype associations in our SSVS variant are based on the median of the posterior distribution or on Bayes factors. We also show here that by using continuous‐scale weighting variables it is possible to improve mixing properties of Markov chain Monte Carlo sampling substantially compared to standard SSVS. Also, the separation of association signals and nonsignals (control of noise level) seems to be more efficient compared to the standard SSVS. Thus, the novel method provides an efficient new framework for SSVS analysis that additionally provides the whole posterior distribution for pseudo‐indicators, which means more information and may help in decision making.

5.

Total citations: 2 (self-citations: 0; citations by others: 2)

P. G. Blackwell 《Biometrics》2001,57(2):502-507

This article describes an inhomogeneous Poisson point process in the plane with an intensity function based on a Dirichlet tessellation process and a method for using observations on the point process to make fully Bayesian inferences about the underlying tessellation. The method is implemented using a Markov chain Monte Carlo approach. An application to modeling the territories of clans of badgers, Meles meles, is described.
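
A minimal sketch of the kind of process this abstract describes, not the paper's own model or code: it simulates a Poisson point process on the unit square whose intensity is constant within each cell of a Dirichlet (Voronoi) tessellation, using thinning. The piecewise-constant intensity, the generator coordinates, and all function names are assumptions made for illustration; the paper's intensity function may differ, and the MCMC inference about the tessellation is not shown.

```python
import random


def nearest_generator(point, generators):
    """Index of the Dirichlet (Voronoi) cell containing `point`."""
    px, py = point
    return min(
        range(len(generators)),
        key=lambda i: (px - generators[i][0]) ** 2 + (py - generators[i][1]) ** 2,
    )


def poisson_draw(rng, mean):
    """Poisson variate: count unit-rate exponential arrivals in [0, mean]."""
    count, total = 0, rng.expovariate(1.0)
    while total <= mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def simulate_points(generators, cell_intensity, rng):
    """Thinning on the unit square (area 1), assumed piecewise-constant intensity."""
    lam_max = max(cell_intensity)
    if lam_max <= 0:
        return []
    points = []
    # Propose from a homogeneous process with rate lam_max, then thin.
    for _ in range(poisson_draw(rng, lam_max)):
        p = (rng.random(), rng.random())
        cell = nearest_generator(p, generators)
        if rng.random() < cell_intensity[cell] / lam_max:
            points.append((p, cell))
    return points


# Hypothetical example: two cells split at x = 0.5, the right one empty.
rng = random.Random(1)
generators = [(0.25, 0.5), (0.75, 0.5)]
points = simulate_points(generators, [50.0, 0.0], rng)
```

Thinning is used here because it only needs an upper bound on the intensity, so the same loop would work for other bounded intensity functions defined on the cells.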

6.

Geometric convergence and central limit theorems for multidimensional Hastings and Metropolis algorithms. Total citations: 6 (self-citations: 0; citations by others: 6)

ROBERTS G. O.; TWEEDIE R. L. 《Biometrika》1996,83(1):95-110


7.

A tutorial introduction to Bayesian inference for stochastic epidemic models using Markov chain Monte Carlo methods. Total citations: 1 (self-citations: 0; citations by others: 1)

Philip D. O'Neill 《Mathematical biosciences》2002,180(1-2)

Recent Bayesian methods for the analysis of infectious disease outbreak data using stochastic epidemic models are reviewed. These methods rely on Markov chain Monte Carlo methods. Both temporal and non-temporal data are considered. The methods are illustrated with a number of examples featuring different models and datasets.

8.

A Bayesian approach to disease gene location using allelic association

Denham MC Whittaker JC 《Biostatistics (Oxford, England)》2003,4(3):399-409

A Bayesian approach to analysing data from family-based association studies is developed. This permits direct assessment of the range of possible values of model parameters, such as the recombination frequency and allelic associations, in the light of the data. In addition, sophisticated comparisons of different models may be handled easily, even when such models are not nested. The methodology is developed in such a way as to allow separate inferences to be made about linkage and association by including theta, the recombination fraction between the marker and disease susceptibility locus under study, explicitly in the model. The method is illustrated by application to a previously published data set. The data analysis raises some interesting issues, notably with regard to the weight of evidence necessary to convince us of linkage between a candidate locus and disease.

9.

Adaptive sampling for Bayesian variable selection. Total citations: 1 (self-citations: 0; citations by others: 1)

Nott David J.; Kohn Robert 《Biometrika》2005,92(4):747-763


10.

Ying Wang Bruce Rannala 《Genetics》2014,198(4):1621-1628

Recombination generates variation and facilitates evolution. Recombination (or lack thereof) also contributes to human genetic disease. Methods for mapping genes influencing complex genetic diseases via association rely on linkage disequilibrium (LD) in human populations, which is influenced by rates of recombination across the genome. Comparative population genomic analyses of recombination using related primate species can identify factors influencing rates of recombination in humans. Such studies can indicate how variable hotspots for recombination may be both among individuals (or populations) and over evolutionary timescales. Previous studies have suggested that locations of recombination hotspots are not conserved between humans and chimpanzees. We made use of the data sets from recent resequencing projects and applied a Bayesian method for identifying hotspots and estimating recombination rates. We also reanalyzed SNP data sets for regions with known hotspots in humans using samples from the human and chimpanzee. The Bayes factors (BF) of shared recombination hotspots between human and chimpanzee across regions were obtained. Based on the analysis of the aligned regions of human chromosome 21, locations where the two species show evidence of shared recombination hotspots (with high BFs) were identified. Interestingly, previous comparative studies of human and chimpanzee that focused on the known human recombination hotspots within the β-globin and HLA regions did not find overlapping of hotspots. Our results show high BFs of shared hotspots at locations within both regions, and the estimated locations of shared hotspots overlap with the locations of human recombination hotspots obtained from sperm-typing studies.

11.

McDonald JW Smith PW Forster JJ 《Biometrics》1999,55(2):620-624

We propose Metropolis-Hastings sampling methods for estimating the exact conditional p-value for tests of goodness of fit of log-linear models for mortality rates and standardized mortality ratios. We focus on two-way tables, where the required conditional distribution is a multivariate noncentral hypergeometric distribution with known noncentrality parameter. Two examples are presented: a 2 x 3 table, where the exact results, obtained by enumeration, are available for comparison, and a 9 x 7 table, where Monte Carlo methods provide the only feasible approach for exact inference.

12.

Peskun's theorem and a modified discrete-state Gibbs sampler. Total citations: 1 (self-citations: 0; citations by others: 1)

LIU JUN S. 《Biometrika》1996,83(3):681-682


13.

Bayesian analysis of amino acid substitution models

Huelsenbeck JP Joyce P Lakner C Ronquist F 《Philosophical transactions of the Royal Society of London. Series B, Biological sciences》2008,363(1512):3941-3953

Models of amino acid substitution present challenges beyond those often faced with the analysis of DNA sequences. The alignments of amino acid sequences are often small, whereas the number of parameters to be estimated is potentially large when compared with the number of free parameters for nucleotide substitution models. Most approaches to the analysis of amino acid alignments have focused on the use of fixed amino acid models in which all of the potentially free parameters are fixed to values estimated from a large number of sequences. Often, these fixed amino acid models are specific to a gene or taxonomic group (e.g. the Mtmam model, which has parameters that are specific to mammalian mitochondrial gene sequences). Although the fixed amino acid models succeed in reducing the number of free parameters to be estimated--indeed, they reduce the number of free parameters from approximately 200 to 0--it is possible that none of the currently available fixed amino acid models is appropriate for a specific alignment. Here, we present four approaches to the analysis of amino acid sequences. First, we explore the use of a general time reversible model of amino acid substitution using a Dirichlet prior probability distribution on the 190 exchangeability parameters. Second, we then explore the behaviour of prior probability distributions that are 'centred' on the rates specified by the fixed amino acid model. Third, we consider a mixture of fixed amino acid models. Finally, we consider constraints on the exchangeability parameters as partitions, similar to how nucleotide substitution models are specified, and place a Dirichlet process prior model on all the possible partitioning schemes.

14.

A general comparison of relaxed molecular clock models. Total citations: 4 (self-citations: 0; citations by others: 4)

Lepage T Bryant D Philippe H Lartillot N 《Molecular biology and evolution》2007,24(12):2669-2680

Several models have been proposed to relax the molecular clock in order to estimate divergence times. However, it is unclear which model has the best fit to real data and should therefore be used to perform molecular dating. In particular, we do not know whether rate autocorrelation should be considered or which prior on divergence times should be used. In this work, we propose a general benchmark of alternative relaxed clock models. We have reimplemented most of the already existing models, including the popular lognormal model, as well as various prior choices for divergence times (birth-death, Dirichlet, uniform), in a common Bayesian statistical framework. We also propose a new autocorrelated model, called the "CIR" process, with well-defined stationary properties. We assess the relative fitness of these models and priors, when applied to 3 different protein data sets from eukaryotes, vertebrates, and mammals, by computing Bayes factors using a numerical method called thermodynamic integration. We find that the 2 autocorrelated models, CIR and lognormal, have a similar fit and clearly outperform uncorrelated models on all 3 data sets. In contrast, the optimal choice for the divergence time prior is more dependent on the data investigated. Altogether, our results provide useful guidelines for model choice in the field of molecular dating while opening the way to more extensive model comparisons.

15.

Swartz MD Kimmel M Mueller P Amos CI 《Biometrics》2006,62(2):495-503

Mapping the genes for a complex disease, such as diabetes or rheumatoid arthritis (RA), involves finding multiple genetic loci that may contribute to the onset of the disease. Pairwise testing of the loci leads to the problem of multiple testing. Looking at haplotypes, or linear sets of loci, avoids multiple tests but results in a contingency table with sparse counts, especially when using marker loci with multiple alleles. We propose a hierarchical Bayesian model for case-parent triad data that uses a conditional logistic regression likelihood to model the probability of transmission to a diseased child. We define hierarchical prior distributions on the allele main effects to model the genetic dependencies present in the human leukocyte antigen (HLA) region of chromosome 6. First, we add a hierarchical level for model selection that accounts for both locus and allele selection. This allows us to cast the problem of identifying genetic loci relevant to the disease into a problem of Bayesian variable selection. Second, we attempt to include linkage disequilibrium as a covariance structure in the prior for model coefficients. We evaluate the performance of the procedure with some simulated examples and then apply our procedure to identifying genetic markers in the HLA region that influence risk for RA. Our software is available on the website http://www.epigenetic.org/Linkage/ssgs-public/.

16.

Genetic linkage analysis methods based on Bayesian statistics

Tang Zaixiang; Wang Xuefeng; Wu Wenwen; Xu Chenwu 《Hereditas (Beijing) [遗传]》2006,28(9):1117-1122

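The Bayesian school is an important school of thought distinct from classical mathematical statistics, and the Bayesian statistical methods it developed are widely applied in many areas of modern science. This paper examines the application of Bayesian statistics to genetic linkage analysis, including Bayesian estimation of the genetic recombination fraction, Bayes factor tests for genetic linkage, and construction of genetic linkage maps based on Markov chain Monte Carlo theory. Simulation studies and analyses of real examples, carried out with purpose-written SAS/IML programs, confirmed the effectiveness and practicality of Bayesian methods in genetic linkage analysis.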
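
To make the first two ideas in this abstract concrete, here is a minimal sketch in Python (the authors used SAS/IML; this is not their program). It assumes a simple backcross-style data set with `k` recombinants among `n` informative meioses, a uniform prior on the recombination fraction θ over (0, 0.5), and a comparison of "linkage" (θ free in (0, 0.5)) against "no linkage" (θ = 0.5). The function name, the prior, and the example counts are all assumptions made for illustration.

```python
def linkage_posterior(k, n, grid=10000):
    """Posterior mean of theta and Bayes factor for linkage vs. theta = 0.5.

    Uses the midpoint rule on (0, 0.5). The binomial coefficient is the same
    under both hypotheses, so it cancels and is omitted.
    """
    h = 0.5 / grid
    thetas = [(i + 0.5) * h for i in range(grid)]
    like = [t ** k * (1.0 - t) ** (n - k) for t in thetas]

    integral = sum(like) * h  # integral of the likelihood over (0, 0.5)
    post_mean = sum(t * l for t, l in zip(thetas, like)) * h / integral

    marginal_linked = 2.0 * integral  # uniform prior density on (0, 0.5) is 2
    marginal_unlinked = 0.5 ** n      # likelihood at theta = 0.5
    return post_mean, marginal_linked / marginal_unlinked


# Hypothetical data: 10 recombinants observed among 100 meioses.
post_mean, bayes_factor = linkage_posterior(10, 100)
print(f"posterior mean of theta: {post_mean:.4f}")
print(f"Bayes factor (linked vs. unlinked): {bayes_factor:.3g}")
```

A grid is enough here because θ is one-dimensional; map construction with many loci, as the abstract notes, is where Markov chain Monte Carlo becomes necessary.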

17.

Ghosh J Herring AH Siega-Riz AM 《Biometrics》2011,67(3):917-925

In this article, we develop a latent class model with class probabilities that depend on subject-specific covariates. One of our major goals is to identify important predictors of latent classes. We consider methodology that allows estimation of latent classes while allowing for variable selection uncertainty. We propose a Bayesian variable selection approach and implement a stochastic search Gibbs sampler for posterior computation to obtain model-averaged estimates of quantities of interest such as marginal inclusion probabilities of predictors. Our methods are illustrated through simulation studies and application to data on weight gain during pregnancy, where it is of interest to identify important predictors of latent weight gain classes.

18.

Model choice in generalised linear models: A Bayesian approach via Kullback-Leibler projections

GOUTIS CONSTANTINOS; ROBERT CHRISTIAN P. 《Biometrika》1998,85(1):29-37


19.

Linkage analysis of quantitative trait loci in multiple line crosses. Total citations: 8 (self-citations: 0; citations by others: 8)

Yi N Xu S 《Genetica》2002,114(3):217-230

Simple line crosses, for example, backcross and F₂, are commonly used in mapping quantitative trait loci (QTL). However, these simple crosses are rarely used alone in commercial plant breeding; rather, crosses involving multiple inbred lines or several simple crosses but connected by shared inbred lines may be common in plant breeding. Mapping QTL using crosses of multiple lines is more relevant to plant breeding. Unfortunately, current statistical methods and computer programs of QTL mapping are all designed for simple line crosses or multiple line crosses but under a regular mating system. It is not straightforward to extend the existing methods to handle multiple line crosses under irregular and complicated mating designs. The major hurdle comes from irregular inbreeding, multiple generations, and multiple alleles. In this study, we develop a Bayesian method implemented via the Markov chain Monte Carlo (MCMC) algorithm for mapping QTL using complicated multiple line crosses. With the MCMC algorithm, we are able to draw a complete path of the gene flow from founder alleles to their descendants via a recursive process. This has greatly simplified the problem caused by irregular mating and inbreeding in the mapping population. Adopting the reversible jump MCMC algorithm, we are able to simultaneously search for multiple QTL along the genome. We can even infer the posterior distribution of the number of QTL, one of the most important parameters in QTL study. Application of the new MCMC based QTL mapping procedure is demonstrated using two different mating designs. Design I involves two inbred lines and their derived F₁, F₂, and BC populations. Design II is a half-diallel cross involving three inbred lines. The two designs appear different, but can be handled with the same robust computer program.

20.

Bayesian phylogenetic model selection using reversible jump Markov chain Monte Carlo

Huelsenbeck JP Larget B Alfaro ME 《Molecular biology and evolution》2004,21(6):1123-1133

A common problem in molecular phylogenetics is choosing a model of DNA substitution that does a good job of explaining the DNA sequence alignment without introducing superfluous parameters. A number of methods have been used to choose among a small set of candidate substitution models, such as the likelihood ratio test, the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and Bayes factors. Current implementations of any of these criteria suffer from the limitation that only a small set of models are examined, or that the test does not allow easy comparison of non-nested models. In this article, we expand the pool of candidate substitution models to include all possible time-reversible models. This set includes seven models that have already been described. We show how Bayes factors can be calculated for these models using reversible jump Markov chain Monte Carlo, and apply the method to 16 DNA sequence alignments. For each data set, we compare the model with the best Bayes factor to the best models chosen using AIC and BIC. We find that the best model under any of these criteria is not necessarily the most complicated one; models with an intermediate number of substitution types typically do best. Moreover, almost all of the models that are chosen as best do not constrain a transition rate to be the same as a transversion rate, suggesting that it is the transition/transversion rate bias that plays the largest role in determining which models are selected. Importantly, the reversible jump Markov chain Monte Carlo algorithm described here allows estimation of phylogeny (and other phylogenetic model parameters) to be performed while accounting for uncertainty in the model of DNA substitution.
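
For readers unfamiliar with the two information criteria named in this abstract, the sketch below shows how AIC and BIC rank candidate substitution models from their maximized log-likelihoods. The log-likelihood values and the number of sites are invented for illustration, and the parameter counts cover only the substitution model (branch lengths and topology are ignored); none of these numbers come from the paper, and the reversible jump MCMC calculation of Bayes factors is not shown.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (smaller is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln n - 2 ln L (smaller is better)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical fits: model name -> (maximized log-likelihood, free parameters).
candidates = {
    "JC69": (-2150.0, 0),
    "HKY85": (-2100.0, 4),
    "GTR": (-2097.0, 8),
}
n_sites = 1000  # assumed alignment length, used as the sample size for BIC

best_aic = min(candidates, key=lambda m: aic(*candidates[m]))
best_bic = min(candidates, key=lambda m: bic(*candidates[m], n_sites))
print("best by AIC:", best_aic)
print("best by BIC:", best_bic)
```

With these made-up numbers both criteria prefer the intermediate model, because the small gain in log-likelihood from the richest model does not offset its extra parameters.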