Found 20 similar articles (search time: 15 ms)
1.
A Bayesian approach to analysing data from family-based association studies is developed. This permits direct assessment of the range of possible values of model parameters, such as the recombination frequency and allelic associations, in the light of the data. In addition, sophisticated comparisons of different models may be handled easily, even when such models are not nested. The methodology is developed in such a way as to allow separate inferences to be made about linkage and association by including theta, the recombination fraction between the marker and disease susceptibility locus under study, explicitly in the model. The method is illustrated by application to a previously published data set. The data analysis raises some interesting issues, notably with regard to the weight of evidence necessary to convince us of linkage between a candidate locus and disease.
2.
A common problem in molecular phylogenetics is choosing a model of DNA substitution that does a good job of explaining the DNA sequence alignment without introducing superfluous parameters. A number of methods have been used to choose among a small set of candidate substitution models, such as the likelihood ratio test, the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and Bayes factors. Current implementations of any of these criteria suffer from the limitation that only a small set of models are examined, or that the test does not allow easy comparison of non-nested models. In this article, we expand the pool of candidate substitution models to include all possible time-reversible models. This set includes seven models that have already been described. We show how Bayes factors can be calculated for these models using reversible jump Markov chain Monte Carlo, and apply the method to 16 DNA sequence alignments. For each data set, we compare the model with the best Bayes factor to the best models chosen using AIC and BIC. We find that the best model under any of these criteria is not necessarily the most complicated one; models with an intermediate number of substitution types typically do best. Moreover, almost all of the models that are chosen as best do not constrain a transition rate to be the same as a transversion rate, suggesting that it is the transition/transversion rate bias that plays the largest role in determining which models are selected. Importantly, the reversible jump Markov chain Monte Carlo algorithm described here allows estimation of phylogeny (and other phylogenetic model parameters) to be performed while accounting for uncertainty in the model of DNA substitution.
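As a rough illustration of the two information criteria named in this abstract, the sketch below ranks three substitution models by AIC and BIC. The log-likelihoods, the alignment length, and the convention of counting only substitution-model parameters (ignoring branch lengths) are hypothetical assumptions chosen for illustration; they are not values from the article, and the sketch does not implement the article's reversible jump MCMC or Bayes factors.

```python
import math


def aic(log_likelihood: float, k: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood: float, k: int, n: int) -> float:
    """Bayesian Information Criterion: k ln n - 2 ln L (lower is better)."""
    return k * math.log(n) - 2 * log_likelihood


# Hypothetical (log-likelihood, free substitution parameters) per model.
candidates = {
    "JC69": (-2650.0, 0),
    "HKY85": (-2600.0, 4),
    "GTR": (-2597.0, 8),
}
n_sites = 1000  # hypothetical alignment length used as the sample size

best_aic = min(candidates, key=lambda m: aic(*candidates[m]))
best_bic = min(candidates, key=lambda m: bic(*candidates[m], n_sites))
print(best_aic, best_bic)
```

With these made-up numbers the intermediate model is preferred by both criteria, because the small gain in likelihood from the richest model does not offset its extra parameters.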
3.
Adaptive sampling for Bayesian variable selection (total citations: 1; self-citations: 0; citations by others: 1)
4.
5.
We present a statistical method, and its accompanying algorithms, for the selection of a mathematical model of the gating mechanism of an ion channel and for the estimation of the parameters of this model. The method assumes a hidden Markov model that incorporates filtering, colored noise and state-dependent white excess noise for the recorded data. The model selection and parameter estimation are performed via a Bayesian approach using Markov chain Monte Carlo. The method is illustrated by its application to single-channel recordings of the K+ outward-rectifier in barley leaf. Acknowledgement: The authors thank Sake Vogelzang, Bert van Duijn and Bert de Boer for their helpful advice and useful comments and suggestions.
6.
Introgression in admixed populations can be used to identify candidate loci that might underlie adaptation or reproductive isolation. The Bayesian genomic cline model provides a framework for quantifying variable introgression in admixed populations and identifying regions of the genome with extreme introgression that are potentially associated with variation in fitness. Here we describe the bgc software, which uses Markov chain Monte Carlo to estimate the joint posterior probability distribution of the parameters in the Bayesian genomic cline model and designate outlier loci. This software can be used with next‐generation sequence data, accounts for uncertainty in genotypic state, and can incorporate information from linked loci on a genetic map. Output from the analysis is written to an HDF5 file for efficient storage and manipulation. This software is written in C++. The source code, software manual, compilation instructions and example data sets are available under the GNU Public License at http://sites.google.com/site/bgcsoftware/.
7.
In protein-coding DNA sequences, historical patterns of selection can be inferred from amino acid substitution patterns. High relative rates of nonsynonymous to synonymous changes (= dN/dS) are a clear indicator of positive, or directional, selection, and several recently developed methods attempt to distinguish these sites from those under neutral or purifying selection. One method uses an empirical Bayesian framework that accounts for varying selective pressures across sites while conditioning on the parameters of the model of DNA evolution and on the phylogenetic history. We describe a method that identifies sites under diversifying selection using a fully Bayesian framework. Similar to earlier work, the method presented here allows the rate of nonsynonymous to synonymous changes to vary among sites. The significant difference in using a fully Bayesian approach lies in our ability to account for uncertainty in parameters including the tree topology, branch lengths, and the codon model of DNA substitution. We demonstrate the utility of the fully Bayesian approach by applying our method to a data set of the vertebrate β-globin gene. Compared to a previous analysis of this data set, the hierarchical model found most of the same sites to be in the positive selection class, but with a few striking exceptions.
8.
In this article, we consider the problem of the estimation of quantitative trait loci (QTL), those chromosomal regions at which genetic information affecting some quantitative trait is encoded. Generally the number of such encoding sites is unknown, and associations between neutral molecular marker genotypes and observed trait phenotypes are sought to locate them. We consider a Bayesian model for simple experimental designs, and discuss the existing approaches to inference for this problem. In particular, we focus on locating positions of the best candidate markers segregating for the trait, a situation which is of primary interest in comparative mapping. We introduce a loss function for estimating both the number of QTL and their location, and we illustrate its application via simulated and real data.
9.
Huelsenbeck JP Rannala B 《Evolution; international journal of organic evolution》2003,57(6):1237-1247
Abstract.— The importance of accommodating the phylogenetic history of a group when performing a comparative analysis is now widely recognized. The typical approaches either assume the tree is known without error, or they base inferences on a collection of well-supported trees or on a collection of trees generated under a stochastic model of cladogenesis. However, these approaches do not adequately account for the uncertainty of phylogenetic trees in a comparative analysis, especially when data relevant to the phylogeny of a group are available. Here, we develop a method for performing comparative analyses that is based on an extension of Felsenstein's independent contrasts method. Uncertainties in the phylogeny, branch lengths, and other parameters are accommodated by averaging over all possible trees, weighting each by the probability that the tree is correct. We do this in a Bayesian framework and use Markov chain Monte Carlo to perform the high-dimensional summations and integrations required by the analysis. We illustrate the method using comparative characters sampled from Anolis lizards.
10.
In this paper we develop a Bayesian approach to parameter estimation in a stochastic spatio-temporal model of the spread of invasive species across a landscape. To date, statistical techniques, such as logistic and autologistic regression, have outstripped stochastic spatio-temporal models in their ability to handle large numbers of covariates. Here we seek to address this problem by making use of a range of covariates describing the bio-geographical features of the landscape. Relative to regression techniques, stochastic spatio-temporal models are more transparent in their representation of biological processes. They also explicitly model temporal change, and therefore do not require the assumption that the species' distribution (or other spatial pattern) has already reached equilibrium as is often the case with standard statistical approaches. In order to illustrate the use of such techniques we apply them to the analysis of data detailing the spread of an invasive plant, Heracleum mantegazzianum, across Britain in the 20th Century using geo-referenced covariate information describing local temperature, elevation and habitat type. The use of Markov chain Monte Carlo sampling within a Bayesian framework facilitates statistical assessments of differences in the suitability of different habitat classes for H. mantegazzianum, and enables predictions of future spread to account for parametric uncertainty and system variability. Our results show that ignoring such covariate information may lead to biased estimates of key processes and implausible predictions of future distributions.
11.
12.
Summary. We consider the estimation of the size of a closed population, often of interest for wild animal populations, using a capture–recapture study. The estimate of the total population size can be very sensitive to the choice of model used to fit to the data. We consider a Bayesian approach, in which we consider all eight plausible models initially described by Otis et al. (1978, Wildlife Monographs 62, 1–135) within a single framework, including models containing an individual heterogeneity component. We show how we are able to obtain a model-averaged estimate of the total population, incorporating both parameter and model uncertainty. To illustrate the methodology we initially perform a simulation study and analyze two datasets where the population size is known, before considering a real example relating to a population of dolphins off northeast Scotland.
13.
14.
This article is concerned with the Bayesian estimation of stochastic rate constants in the context of dynamic models of intracellular processes. The underlying discrete stochastic kinetic model is replaced by a diffusion approximation (or stochastic differential equation approach) where a white noise term models stochastic behavior and the model is identified using equispaced time course data. The estimation framework involves the introduction of m - 1 latent data points between every pair of observations. MCMC methods are then used to sample the posterior distribution of the latent process and the model parameters. The methodology is applied to the estimation of parameters in a prokaryotic autoregulatory gene network.
15.
In this article, we propose a Bayesian approach to phase I/II dose-finding oncology trials by jointly modeling a binary toxicity outcome and a continuous biomarker expression outcome. We apply our method to a clinical trial of a new gene therapy for bladder cancer patients. In this trial, the biomarker expression indicates biological activity of the new therapy. For ethical reasons, the trial is conducted sequentially, with the dose for each successive patient chosen using both toxicity and activity data from patients previously treated in the trial. The modeling framework that we use naturally incorporates correlation between the binary toxicity and continuous activity outcome via a latent Gaussian variable. The dose-escalation/de-escalation decision rules are based on the posterior distributions of both toxicity and activity. A flexible state-space model is used to relate the activity outcome and dose. Extensive simulation studies show that the design reliably chooses the preferred dose using both toxicity and expression outcomes under various clinical scenarios.
16.
Model-based estimation of the human health risks resulting from exposure to environmental contaminants can be an important tool for structuring public health policy. Due to uncertainties in the modeling process, the outcomes of these assessments are usually probabilistic representations of a range of possible risks. In some cases, health surveillance data are available for the assessment population over all or a subset of the risk projection period and this additional information can be used to augment the model-based estimates. We use a Bayesian approach to update model-based estimates of health risks based on available health outcome data. Updated uncertainty distributions for risk estimates are derived using Monte Carlo sampling, which allows flexibility to model realistic situations including measurement error in the observable outcomes. We illustrate the approach by using imperfect public health surveillance data on lung cancer deaths to update model-based lung cancer mortality risk estimates in a population exposed to ionizing radiation from a uranium processing facility.
17.
A yearlong study was conducted to determine factors that affect the abundance and distribution of lysogens and free viruses at fresh-, brackish-, and saltwater stations in Newport Bay, CA. The viral and bacterial abundances were highest in the freshwater (average 1.1 × 10^8 and 1.1 × 10^7 mL^-1, respectively) and lowest in the marine water (average 0.4 × 10^8 and 0.5 × 10^7 mL^-1, respectively). Bacterial and viral counts were also several times higher during the summer than in winter. Approximately 35% of the 141 samples were inducible in the presence of mitomycin C. The highest percentage of inducible lysogens was observed in marine waters (42%), while the lowest percentage was observed in the warmer freshwater (23%). A statistical model for the joint occurrence of lysogens and free viruses was formulated and estimated using Bayesian techniques to understand the key environmental determinants of viruses and lysogens. Our results support the existence of significant heterogeneity between the saltwater and freshwater sites. A parsimonious model that combines the two saltwater sites performs best among the specifications that were considered. Bacteria and water temperature were significant determinants of virus counts, whereas lysogen relationships are unclear. Importantly, conditional on the covariates, viruses and lysogen fractions exhibit robust negative correlation.
18.
In silico model‐based inference: A contemporary approach for hypothesis testing in network biology
David J. Klinke II 《Biotechnology progress》2014,30(6):1247-1261
Inductive inference plays a central role in the study of biological systems where one aims to increase their understanding of the system by reasoning backwards from uncertain observations to identify causal relationships among components of the system. These causal relationships are postulated from prior knowledge as a hypothesis or simply a model. Experiments are designed to test the model. Inferential statistics are used to establish a level of confidence in how well our postulated model explains the acquired data. This iterative process, commonly referred to as the scientific method, either improves our confidence in a model or suggests that we revisit our prior knowledge to develop a new model. Advances in technology impact how we use prior knowledge and data to formulate models of biological networks and how we observe cellular behavior. However, the approach for model‐based inference has remained largely unchanged since Fisher, Neyman and Pearson developed the ideas in the early 1900s that gave rise to what is now known as classical statistical hypothesis (model) testing. Here, I will summarize conventional methods for model‐based inference and suggest a contemporary approach to aid in our quest to discover how cells dynamically interpret and transmit information for therapeutic aims that integrates ideas drawn from high performance computing, Bayesian statistics, and chemical kinetics. © 2014 American Institute of Chemical Engineers Biotechnol. Prog., 30:1247–1261, 2014
19.
Zheng Li Vernon M. Chinchilli Ming Wang 《Biometrical journal. Biometrische Zeitschrift》2019,61(1):187-202
Recurrent events could be stopped by a terminal event, which commonly occurs in biomedical and clinical studies. In this situation, dependent censoring is encountered because of potential dependence between these two event processes, leading to invalid inference if analyzing recurrent events alone. The joint frailty model is one of the widely used approaches to jointly model these two processes by sharing the same frailty term. One important assumption is that recurrent and terminal event processes are conditionally independent given the subject‐level frailty; however, this could be violated when the dependency may also depend on time‐varying covariates across recurrences. Furthermore, marginal correlation between two event processes based on traditional frailty modeling has no closed form solution for estimation with vague interpretation. In order to fill these gaps, we propose a novel joint frailty‐copula approach to model recurrent events and a terminal event with relaxed assumptions. Metropolis–Hastings within the Gibbs Sampler algorithm is used for parameter estimation. Extensive simulation studies are conducted to evaluate the efficiency, robustness, and predictive performance of our proposal. The simulation results show that compared with the joint frailty model, the bias and mean squared error of the proposal are smaller when the conditional independence assumption is violated. Finally, we apply our method to a real example extracted from the MarketScan database to study the association between recurrent strokes and mortality.
20.
Cohen's kappa coefficient is a widely popular measure for chance-corrected nominal scale agreement between two raters. This article describes Bayesian analysis for kappa that can be routinely implemented using Markov chain Monte Carlo (MCMC) methodology. We consider the case of m ≥ 2 independent samples of measured agreement, where in each sample a given subject is rated by two rating protocols on a binary scale. A major focus here is on testing the homogeneity of the kappa coefficient across the different samples. The existing frequentist tests for this case assume exchangeability of rating protocols, whereas our proposed Bayesian test does not make any such assumption. Extensive simulation is carried out to compare the performances of the Bayesian and the frequentist tests. The developed methodology is illustrated using data from a clinical trial in ophthalmology.
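For readers unfamiliar with the statistic, the following is a minimal sketch of the ordinary point estimate of Cohen's kappa for one sample rated by two protocols on a binary scale. The 2 × 2 table of counts is hypothetical, and the function name `cohen_kappa` is an illustrative choice; the sketch shows only the chance-corrected agreement formula, not the Bayesian MCMC analysis or the homogeneity test described in the article.

```python
def cohen_kappa(table):
    """Point estimate of Cohen's kappa from a square table of counts.

    table[i][j] = number of subjects placed in category i by the first
    protocol and in category j by the second protocol.
    """
    size = len(table)
    total = sum(sum(row) for row in table)
    # Observed agreement: share of subjects on the diagonal.
    p_obs = sum(table[i][i] for i in range(size)) / total
    # Agreement expected by chance from the two sets of marginals.
    row_marg = [sum(table[i]) / total for i in range(size)]
    col_marg = [sum(table[i][j] for i in range(size)) / total
                for j in range(size)]
    p_exp = sum(r * c for r, c in zip(row_marg, col_marg))
    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical counts for 50 subjects rated on a binary scale.
table = [[20, 5],
         [10, 15]]
kappa = cohen_kappa(table)
print(round(kappa, 3))
```

Here the observed agreement is 0.7 and the chance agreement is 0.5, so the estimate is (0.7 - 0.5) / (1 - 0.5) = 0.4.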