Similar articles
 20 similar articles found (search time: 0 ms)
1.
Efficient Bayesian inference for Gaussian copula regression models   Cited by: 4 (self-citations: 0, citations by others: 4)

2.
    
This article is concerned with the Bayesian estimation of stochastic rate constants in the context of dynamic models of intracellular processes. The underlying discrete stochastic kinetic model is replaced by a diffusion approximation (or stochastic differential equation approach) where a white noise term models stochastic behavior and the model is identified using equispaced time-course data. The estimation framework involves the introduction of m - 1 latent data points between every pair of observations. MCMC methods are then used to sample the posterior distribution of the latent process and the model parameters. The methodology is applied to the estimation of parameters in a prokaryotic autoregulatory gene network.
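The latent-data construction described in this abstract can be pictured with an Euler-Maruyama discretisation: each interval between observations is split into m sub-steps, which places m - 1 latent points between consecutive observation times. The sketch below is illustrative only; the function name, the drift and diffusion callables, and the toy birth-death rates are assumptions and are not taken from the article.

```python
import math
import random


def euler_maruyama_path(x0, drift, diffusion, n_obs, dt_obs, m, rng):
    """Simulate a scalar diffusion with m sub-steps per observation interval.

    Returns (path, obs_indices): the full path including latent points, and
    the indices of the path entries that fall on observation times.
    """
    dt = dt_obs / m  # fine time step, giving m - 1 latent points per interval
    path = [x0]
    for _ in range((n_obs - 1) * m):
        x = path[-1]
        noise = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        path.append(x + drift(x) * dt + diffusion(x) * noise)
    obs_indices = list(range(0, len(path), m))
    return path, obs_indices


# Toy example with hypothetical rates: diffusion approximation of a
# linear birth-death process.
birth, death = 0.6, 0.5
path, obs_idx = euler_maruyama_path(
    x0=50.0,
    drift=lambda x: (birth - death) * x,
    diffusion=lambda x: math.sqrt(max((birth + death) * x, 0.0)),
    n_obs=5,
    dt_obs=1.0,
    m=4,
    rng=random.Random(1),
)
```

In an MCMC scheme of the kind the abstract outlines, the entries of `path` that are not at `obs_idx` would be treated as unknowns and sampled alongside the rate constants.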

3.
A common problem in molecular phylogenetics is choosing a model of DNA substitution that does a good job of explaining the DNA sequence alignment without introducing superfluous parameters. A number of methods have been used to choose among a small set of candidate substitution models, such as the likelihood ratio test, the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and Bayes factors. Current implementations of any of these criteria suffer from the limitation that only a small set of models is examined, or that the test does not allow easy comparison of non-nested models. In this article, we expand the pool of candidate substitution models to include all possible time-reversible models. This set includes seven models that have already been described. We show how Bayes factors can be calculated for these models using reversible jump Markov chain Monte Carlo, and apply the method to 16 DNA sequence alignments. For each data set, we compare the model with the best Bayes factor to the best models chosen using AIC and BIC. We find that the best model under any of these criteria is not necessarily the most complicated one; models with an intermediate number of substitution types typically do best. Moreover, almost all of the models that are chosen as best do not constrain a transition rate to be the same as a transversion rate, suggesting that it is the transition/transversion rate bias that plays the largest role in determining which models are selected. Importantly, the reversible jump Markov chain Monte Carlo algorithm described here allows estimation of phylogeny (and other phylogenetic model parameters) to be performed while accounting for uncertainty in the model of DNA substitution.
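For readers unfamiliar with the two information criteria named in this abstract, a minimal sketch of their standard definitions follows. The model labels, log-likelihoods, parameter counts, and alignment length are invented for illustration and do not come from the article.

```python
import math


def aic(log_lik, k):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian Information Criterion: k ln(n) - 2 ln L (lower is better)."""
    return k * math.log(n) - 2 * log_lik


# Hypothetical fits: (label, maximised log-likelihood, free parameters)
fits = [
    ("simple", -1050.0, 1),
    ("intermediate", -1020.0, 5),
    ("complex", -1019.0, 9),
]
n_sites = 1000  # assumed alignment length, used as the sample size

best_aic = min(fits, key=lambda f: aic(f[1], f[2]))[0]
best_bic = min(fits, key=lambda f: bic(f[1], f[2], n_sites))[0]
```

With these made-up numbers both criteria prefer the intermediate model, because the most complex model gains too little likelihood to offset its extra parameters.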

4.
    
Ecological studies aim to analyse the variation of disease risk in relation to exposure variables that are measured at an area unit level. In practice it is rarely possible to use the exposure variables themselves, either because the corresponding data are not available or because the causes of the disease are not fully understood. It is therefore quite common to use crude proxies of the real exposure to the disease in question. These proxies are rarely able to explain the disease variation and hence additional area level random effects are introduced to account for the residual variation. In this paper we investigate the possibility of modelling the effect of ecological covariates non‐parametrically, with and without additional random effects for the residual spatial variation. We illustrate the issues arising through analyses of simulated and real data on larynx cancer mortality in Germany for the years 1986 to 1990, where we use the corresponding lung cancer rates as a proxy for smoking consumption.

5.
In protein-coding DNA sequences, historical patterns of selection can be inferred from amino acid substitution patterns. High relative rates of nonsynonymous to synonymous changes (dN/dS) are a clear indicator of positive, or directional, selection, and several recently developed methods attempt to distinguish these sites from those under neutral or purifying selection. One method uses an empirical Bayesian framework that accounts for varying selective pressures across sites while conditioning on the parameters of the model of DNA evolution and on the phylogenetic history. We describe a method that identifies sites under diversifying selection using a fully Bayesian framework. Similar to earlier work, the method presented here allows the rate of nonsynonymous to synonymous changes to vary among sites. The significant difference in using a fully Bayesian approach lies in our ability to account for uncertainty in parameters including the tree topology, branch lengths, and the codon model of DNA substitution. We demonstrate the utility of the fully Bayesian approach by applying our method to a data set of the vertebrate β-globin gene. Compared to a previous analysis of this data set, the hierarchical model found most of the same sites to be in the positive selection class, but with a few striking exceptions.

6.
    
A hierarchical Bayesian regression model is fitted to longitudinal data on Haemophilus influenzae type b (Hib) serum antibodies. To estimate the decline rate of the antibody concentration, the model accommodates the possibility of unobserved subclinical infections with Hib bacteria that cause increasing concentrations during the study period. The computations rely on Markov chain Monte Carlo simulation of the joint posterior distribution of the model parameters. The model is used to predict the duration of immunity to subclinical Hib infection and to a serious invasive Hib disease.

7.
An increased availability of genotypes at marker loci has prompted the development of models that include the effect of individual genes. Selection based on these models is known as marker-assisted selection (MAS). MAS is known to be efficient especially for traits that have low heritability and non-additive gene action. BLUP methodology under non-additive gene action is not feasible for large inbred or crossbred pedigrees. It is easy to incorporate non-additive gene action in a finite locus model. Under such a model, the unobservable genotypic values can be predicted using the conditional mean of the genotypic values given the data. To compute this conditional mean, conditional genotype probabilities must be computed. In this study these probabilities were computed using iterative peeling and three Markov chain Monte Carlo (MCMC) methods: scalar Gibbs, blocking Gibbs, and a sampler that combines the Elston-Stewart algorithm with iterative peeling (ESIP). The performance of these four methods was assessed using simulated data. For pedigrees with loops, iterative peeling fails to provide accurate genotype probability estimates for some pedigree members. Also, computing time is exponentially related to the number of loci in the model. For MCMC methods, a linear relationship can be maintained by sampling genotypes one locus at a time. Of the three MCMC methods considered, ESIP performed best and scalar Gibbs performed worst.

8.
One barrier to interpreting the observational evidence concerning the adverse health effects of air pollution for public policy purposes is the measurement error inherent in estimates of exposure based on ambient pollutant monitors. Exposure assessment studies have shown that data from monitors at central sites may not adequately represent personal exposure. Thus, the exposure error resulting from using centrally measured data as a surrogate for personal exposure can potentially lead to a bias in estimates of the health effects of air pollution. This paper develops a multi-stage Poisson regression model for evaluating the effects of exposure measurement error on estimates of effects of particulate air pollution on mortality in time-series studies. To implement the model, we have used five validation data sets on personal exposure to PM10. Our goal is to combine data on the associations between ambient concentrations of particulate matter and mortality for a specific location, with the validation data on the association between ambient and personal concentrations of particulate matter at the locations where data have been collected. We use these data in a model to estimate the relative risk of mortality associated with estimated personal-exposure concentrations and make a comparison with the risk of mortality estimated with measurements of ambient concentration alone. We apply this method to data comprising daily mortality counts, ambient concentrations of PM10 measured at a central site, and temperature for Baltimore, Maryland from 1987 to 1994. We have selected our home city of Baltimore to illustrate the method; the measurement error correction model is general and can be applied to other appropriate locations.
Our approach uses a combination of: (1) a generalized additive model with log link and Poisson error for the mortality-personal-exposure association; (2) a multi-stage linear model to estimate the variability across the five validation data sets in the personal-ambient-exposure association; (3) data augmentation methods to address the uncertainty resulting from the missing personal exposure time series in Baltimore. In the Poisson regression model, we account for smooth seasonal and annual trends in mortality using smoothing splines. Taking into account the heterogeneity across locations in the personal-ambient-exposure relationship, we quantify the degree to which the exposure measurement error biases the results toward the null hypothesis of no effect, and estimate the loss of precision in the estimated health effects due to indirectly estimating personal exposures from ambient measurements.

9.
    
Kozumi H. Biometrics, 2000, 56(4): 1002-1006
This paper considers discrete survival data from a Bayesian point of view. The sequence of baseline hazard functions, which plays an important role in the discrete hazard function, is modeled with a hidden Markov chain. It is explained how the resultant model is implemented via Markov chain Monte Carlo methods. The model is illustrated with an application to real data.
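As background for the discrete hazard function mentioned in this abstract, the sketch below shows the standard link between discrete-time hazards and the survival function, S(t) = (1 - h_1)(1 - h_2)...(1 - h_t). The function name and the example hazards are assumptions for illustration; the hidden Markov prior on the baseline hazards is not modelled here.

```python
def survival_from_hazards(hazards):
    """Return [S(1), ..., S(T)] where S(t) is the product of (1 - h_j) for j <= t."""
    surv = []
    s = 1.0
    for h in hazards:
        if not 0.0 <= h <= 1.0:
            raise ValueError("each discrete hazard must lie in [0, 1]")
        s *= 1.0 - h  # probability of surviving this interval too
        surv.append(s)
    return surv


# Hypothetical hazards for three consecutive intervals
surv = survival_from_hazards([0.1, 0.2, 0.5])
```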

10.
11.
    
For a finite locus model, Markov chain Monte Carlo (MCMC) methods can be used to estimate the conditional mean of genotypic values given phenotypes, which is also known as the best predictor (BP). When computationally feasible, this type of genetic prediction provides an elegant solution to the problem of genetic evaluation under non-additive inheritance, especially for crossbred data. Successful application of MCMC methods for genetic evaluation using finite locus models depends, among other factors, on the number of loci assumed in the model. The effect of the assumed number of loci on evaluations obtained by BP was investigated using data simulated with about 100 loci. For several small pedigrees, genetic evaluations obtained by best linear prediction (BLP) were compared to genetic evaluations obtained by BP. For BLP evaluation, used here as the standard of comparison, only the first and second moments of the joint distribution of the genotypic and phenotypic values must be known. These moments were calculated from the gene frequencies and genotypic effects used in the simulation model. BP evaluation requires the complete distribution to be known. For each model used for BP evaluation, the gene frequencies and genotypic effects, which completely specify the required distribution, were derived such that the genotypic mean, the additive variance, and the dominance variance were the same as in the simulation model. For lowly heritable traits, evaluations obtained by BP under models with up to three loci closely matched the evaluations obtained by BLP for both purebred and crossbred data. For highly heritable traits, models with up to six loci were needed to match the evaluations obtained by BLP.
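The contrast between BLP and BP drawn in this abstract can be seen from their standard textbook forms. The notation below is generic and assumed for illustration (g for genotypic values, y for phenotypes); it is not taken from the article.

```latex
\hat{g}_{\mathrm{BLP}}
  = \mathrm{E}(g) + \mathrm{Cov}(g, y)\,\mathrm{Var}(y)^{-1}\,\bigl(y - \mathrm{E}(y)\bigr),
\qquad
\hat{g}_{\mathrm{BP}} = \mathrm{E}(g \mid y).
```

The first expression involves only means, variances, and covariances, whereas the conditional expectation in the second generally depends on the full joint distribution of g and y.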

12.
    
King R, Brooks SP, Coulson T. Biometrics, 2008, 64(4): 1187-1195
Summary. We consider the issue of analyzing complex ecological data in the presence of covariate information and model uncertainty. Several issues can arise when analyzing such data, not least the need to account for missing covariate values. This is most acutely observed in the presence of time-varying covariates. We consider mark-recapture-recovery data, where the corresponding recapture probabilities are less than unity, so that individuals are not always observed at each capture event. This often leads to a large amount of missing time-varying individual covariate information, because the covariate cannot usually be recorded if an individual is not observed. In addition, we address the problem of model selection over these covariates with missing data. We consider a Bayesian approach, where we are able to deal with large amounts of missing data, by essentially treating the missing values as auxiliary variables. This approach also allows a quantitative comparison of different models via posterior model probabilities, obtained via the reversible jump Markov chain Monte Carlo algorithm. To demonstrate this approach we analyze data relating to Soay sheep, which pose several statistical challenges in fully describing the intricacies of the system.

13.
  Cited by: 2 (self-citations: 0, citations by others: 2)
King R, Brooks SP. Biometrics, 2008, 64(3): 816-824
Summary. We consider the estimation of the size of a closed population, often of interest for wild animal populations, using a capture–recapture study. The estimate of the total population size can be very sensitive to the choice of model used to fit to the data. We consider a Bayesian approach, in which we consider all eight plausible models initially described by Otis et al. (1978, Wildlife Monographs 62, 1–135) within a single framework, including models containing an individual heterogeneity component. We show how we are able to obtain a model-averaged estimate of the total population, incorporating both parameter and model uncertainty. To illustrate the methodology we initially perform a simulation study and analyze two datasets where the population size is known, before considering a real example relating to a population of dolphins off northeast Scotland.
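The idea of a model-averaged estimate mentioned in this abstract can be illustrated in its simplest form: each model's posterior mean is weighted by that model's posterior probability. The function name and the numbers below are hypothetical and serve only to show the arithmetic.

```python
def model_averaged_estimate(estimates, post_probs):
    """Weight each model's posterior mean by its posterior model probability."""
    if len(estimates) != len(post_probs):
        raise ValueError("need exactly one probability per model")
    total = sum(post_probs)
    if total <= 0:
        raise ValueError("posterior model probabilities must sum to a positive value")
    # Dividing by the total guards against probabilities that are not normalised
    return sum(e * p for e, p in zip(estimates, post_probs)) / total


# Hypothetical posterior means of population size under three models
n_hat = model_averaged_estimate([120.0, 135.0, 150.0], [0.5, 0.3, 0.2])
```

In a reversible jump sampler the same quantity is usually obtained directly, by averaging the sampled population size over all iterations regardless of which model the chain is visiting.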

14.
This article presents a statistical method for detecting recombination in DNA sequence alignments, which is based on combining two probabilistic graphical models: (1) a taxon graph (phylogenetic tree) representing the relationship between the taxa, and (2) a site graph (hidden Markov model) representing interactions between different sites in the DNA sequence alignments. We adopt a Bayesian approach and sample the parameters of the model from the posterior distribution with Markov chain Monte Carlo, using a Metropolis-Hastings and Gibbs-within-Gibbs scheme. The proposed method is tested on various synthetic and real-world DNA sequence alignments, and we compare its performance with the established detection methods RECPARS, PLATO, and TOPAL, as well as with two alternative parameter estimation schemes.
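As a reminder of what a Metropolis-Hastings update looks like, here is a minimal random-walk Metropolis sampler for a scalar target. It is a generic sketch, not the sampler used in the article; the function name, step size, and the standard normal target are assumptions chosen for illustration.

```python
import math
import random


def metropolis(log_target, x0, n_iter, step, rng):
    """Random-walk Metropolis sampler for a scalar, unnormalised log density."""
    x, lp = x0, log_target(x0)
    samples = []
    for _ in range(n_iter):
        prop = x + rng.gauss(0.0, step)  # symmetric Gaussian proposal
        lp_prop = log_target(prop)
        # Accept with probability min(1, target(prop) / target(x))
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            x, lp = prop, lp_prop
        samples.append(x)  # a rejected proposal repeats the current state
    return samples


# Toy target: standard normal, known up to a constant
samples = metropolis(lambda x: -0.5 * x * x, 0.0, 20000, 1.0, random.Random(0))
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

Because the proposal is symmetric, the Hastings correction cancels and only the ratio of target densities appears in the acceptance step.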

15.
    
Pauler DK, Laird NM. Biometrics, 2000, 56(2): 464-472
In clinical trials of a self-administered drug, repeated measures of a laboratory marker, which is affected by study medication and collected in all treatment arms, can provide valuable information on population and individual summaries of compliance. In this paper, we introduce a general finite mixture of nonlinear hierarchical models that allows estimates of component membership probabilities and random effect distributions for longitudinal data arising from multiple subpopulations, such as from noncomplying and complying subgroups in clinical trials. We outline a sampling strategy for fitting these models, which consists of a sequence of Gibbs, Metropolis-Hastings, and reversible jump steps, where the latter is required for switching between component models of different dimensions. Our model is applied to identify noncomplying subjects in the placebo arm of a clinical trial assessing the effectiveness of zidovudine (AZT) in the treatment of patients with HIV, where noncompliance was defined as initiation of AZT during the trial without the investigators' knowledge. We fit a hierarchical nonlinear change-point model for increases in the marker MCV (mean corpuscular volume of erythrocytes) for subjects who noncomply and a constant mean random effects model for those who comply. As part of our fully Bayesian analysis, we assess the sensitivity of conclusions to prior and modeling assumptions and demonstrate how external information and covariates can be incorporated to distinguish subgroups.

16.
    

17.
    

18.
  Cited by: 1 (self-citations: 0, citations by others: 1)
Inoue LY, Thall PF, Berry DA. Biometrics, 2002, 58(4): 823-831
A sequential Bayesian phase II/III design is proposed for comparative clinical trials. The design is based on both survival time and discrete early events that may be related to survival and assumes a parametric mixture model. Phase II involves a small number of centers. Patients are randomized between treatments throughout, and sequential decisions are based on predictive probabilities of concluding superiority of the experimental treatment. Whether to stop early, continue, or shift into phase III is assessed repeatedly in phase II. Phase III begins when additional institutions are incorporated into the ongoing phase II trial. Simulation studies in the context of a non-small-cell lung cancer trial indicate that the proposed method maintains overall size and power while usually requiring substantially smaller sample size and shorter trial duration when compared with conventional group-sequential phase III designs.

19.
Type-II ryanodine receptor channels (RYRs) play a fundamental role in intracellular Ca²⁺ dynamics in the heart. The processes of activation, inactivation, and regulation of these channels have been the subject of intensive research and the focus of recent debates. Typically, approaches to understand these processes involve statistical analysis of single RYRs, involving signal restoration, model estimation, and selection. These tasks are usually performed by following rather phenomenological criteria that turn models into self-fulfilling prophecies. Here, a thorough statistical treatment is applied by modeling single RYRs using aggregated hidden Markov models. Inferences are made using Bayesian statistics and stochastic search methods known as Markov chain Monte Carlo. These methods allow extension of the temporal resolution of the analysis far beyond the limits of previous approaches and provide a direct measure of the uncertainties associated with every estimation step, together with a direct assessment of why and where a particular model fails. Analyses of single RYRs at several Ca²⁺ concentrations are made by considering 16 models, some of them previously reported in the literature. Results clearly show that single RYRs have Ca²⁺-dependent gating modes. Moreover, our results demonstrate that single RYRs responding to a sudden change in Ca²⁺ display adaptation kinetics. Interestingly, the best-ranked models predict microscopic reversibility when monovalent cations are used as the main permeating species. Finally, the extended bandwidth revealed the existence of a novel fast buzz mode at low Ca²⁺ concentrations.
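Bayesian inference for hidden Markov models of the kind mentioned in this abstract rests on evaluating the likelihood of an observed record, which is typically done with the forward algorithm. The sketch below uses a plain discrete two-state HMM rather than the aggregated models of the article; the function name and every probability in the example are assumptions for illustration.

```python
def hmm_likelihood(init, trans, emit, obs):
    """Forward algorithm: P(obs) for a discrete hidden Markov model.

    init[i]     : P(state_0 = i)
    trans[i][j] : P(state_{t+1} = j | state_t = i)
    emit[i][o]  : P(observation = o | state = i)
    obs         : non-empty sequence of observation indices
    """
    n = len(init)
    # alpha[i] = P(observations so far, current state = i)
    alpha = [init[i] * emit[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][o]
            for j in range(n)
        ]
    return sum(alpha)


# Hypothetical two-state channel (0 = closed, 1 = open); observations 0/1 are
# thresholded current levels with some misclassification.
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.2, 0.8]]
emit = [[0.95, 0.05], [0.1, 0.9]]
lik = hmm_likelihood(init, trans, emit, [0, 0, 1])
```

Inside an MCMC scheme this likelihood would be recomputed for each proposed set of transition and emission parameters; for long records the recursion is normally run in log space or with rescaling to avoid underflow.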

20.
  Cited by: 2 (self-citations: 0, citations by others: 2)
Summary. In this article, we present new methods to analyze data from an experiment using rodent models to investigate the role of p27, an important cell-cycle mediator, in early colon carcinogenesis. The responses modeled here are essentially functions nested within a two-stage hierarchy. Standard functional data analysis literature focuses on a single stage of hierarchy and conditionally independent functions with near white noise. However, in our experiment, there is substantial biological motivation for the existence of spatial correlation among the functions, which arise from the locations of biological structures called colonic crypts: this possible functional correlation is a phenomenon we term crypt signaling. Thus, as a point of general methodology, we require an analysis that allows for functions to be correlated at the deepest level of the hierarchy. Our approach is fully Bayesian and uses Markov chain Monte Carlo methods for inference and estimation. Analysis of this data set gives new insights into the structure of p27 expression in early colon carcinogenesis and suggests the existence of significant crypt signaling. Our methodology uses regression splines, and because of the hierarchical nature of the data, dimension reduction of the covariance matrix of the spline coefficients is important: we suggest simple methods for overcoming this problem.
