Similar articles
20 similar articles found.
1.
    
Exposure to air pollution is associated with increased morbidity and mortality. Recent technological advancements permit the collection of time-resolved personal exposure data. Such data are often incomplete with missing observations and exposures below the limit of detection, which limit their use in health effects studies. In this paper, we develop an infinite hidden Markov model for multiple asynchronous multivariate time series with missing data. Our model is designed to include covariates that can inform transitions among hidden states. We implement beam sampling, a combination of slice sampling and dynamic programming, to sample the hidden states, and a Bayesian multiple imputation algorithm to impute missing data. In simulation studies, our model excels in estimating hidden states and state-specific means and imputing observations that are missing at random or below the limit of detection. We validate our imputation approach on data from the Fort Collins Commuter Study. We show that the estimated hidden states improve imputations for data that are missing at random compared to existing approaches. In a case study of the Fort Collins Commuter Study, we describe the inferential gains obtained from our model including improved imputation of missing data and the ability to identify shared patterns in activity and exposure among repeated sampling days for individuals and among distinct individuals.
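As a rough illustration of how dynamic programming over hidden states copes with missing observations, here is a minimal sketch. It deliberately swaps in a simpler technique than the abstract describes: a finite-state HMM with discrete emissions and the plain forward algorithm, not the infinite HMM or beam sampler. The function name `forward_loglik` and the convention of marking a missing observation with `None` are assumptions made for this example. Skipping the emission term is only appropriate for values missing at random; values below a limit of detection would need a censored likelihood, which is not shown.

```python
import math


def forward_loglik(obs, init, trans, emit):
    """Forward algorithm for a finite HMM with discrete emissions.

    obs   : list of observed symbols (ints) or None for a missing value
    init  : init[i]     = P(state_0 = i)
    trans : trans[i][j] = P(state_t = j | state_{t-1} = i)
    emit  : emit[i][y]  = P(obs_t = y | state_t = i)

    Returns (log-likelihood of the non-missing data,
             filtered state probabilities at the last time step).
    """
    n_states = len(init)
    alpha = list(init)
    loglik = 0.0
    for t, y in enumerate(obs):
        if t > 0:
            # Propagate the filtered distribution one step through the chain.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n_states))
                for j in range(n_states)
            ]
        if y is not None:
            # Weight by the emission probability; a missing value
            # contributes a factor of 1, i.e. it is marginalized out.
            alpha = [alpha[j] * emit[j][y] for j in range(n_states)]
        norm = sum(alpha)
        loglik += math.log(norm)
        # Rescale to avoid numerical underflow on long series.
        alpha = [a / norm for a in alpha]
    return loglik, alpha
```

Calling `forward_loglik([0, None, 1], init, trans, emit)` gives the same likelihood as summing over every possible value of the middle observation, which is the sense in which the gap is marginalized rather than imputed.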

2.
    

3.
4.
    
Little is known about human intra-individual metabolic profile changes over an extended period of time. Here, we introduce a novel concept suggesting that children, even at a very young age, can be categorized in terms of metabolic state as they advance in development. Hidden Markov models were used to discover the underlying progression in the metabolic state. We applied the methodology to study metabolic trajectories in children between birth and 4 years of age, based on a series of samples selected from a large birth cohort study. We found multiple previously unknown age- and gender-related metabolome changes of potential medical significance. Specifically, we found that the major developmental state differences between girls and boys are attributed to sphingolipids. In addition, we demonstrated the feasibility of state-based alignment of personal metabolic trajectories. We show that children have different development rates at the level of the metabolome, and thus the state-based approach may be advantageous when applying metabolome profiling in search of markers for subtle (patho)physiological changes.

5.
    
Johnson DS, Hoeting JA. Biometrics 2003, 59(2): 341-350
In this article, we incorporate an autoregressive time-series framework into models for animal survival using capture-recapture data. Researchers modeling animal survival probabilities as the realization of a random process have typically considered survival to be independent from one time period to the next. This may not be realistic for some populations. Using a Gibbs sampling approach, we can estimate covariate coefficients and autoregressive parameters for survival models. The procedure is illustrated with a waterfowl band recovery dataset for northern pintails (Anas acuta). The analysis shows that the second lag autoregressive coefficient is significantly less than 0, suggesting that there is a triennial relationship between survival probabilities and emphasizing that modeling survival rates as independent random variables may be unrealistic in some cases. Software to implement the methodology is available at no charge on the Internet.

6.
    
Background: The recently emerged technology of methylated RNA immunoprecipitation sequencing (MeRIP-seq) sheds light on the study of RNA epigenetics. This new bioinformatics question calls for effective and robust peak-calling algorithms to detect mRNA methylation sites from MeRIP-seq data. Methods: We propose a Bayesian hierarchical model to detect methylation sites from MeRIP-seq data. Our modeling approach includes several important characteristics. First, it models the zero-inflated and over-dispersed counts by deploying a zero-inflated negative binomial model. Second, it incorporates a hidden Markov model (HMM) to account for the spatial dependency of neighboring read enrichment. Third, our Bayesian inference allows the proposed model to borrow strength in parameter estimation, which greatly improves the model stability when dealing with MeRIP-seq data with a small number of replicates. We use Markov chain Monte Carlo (MCMC) algorithms to simultaneously infer the model parameters in a de novo fashion. The R Shiny demo is available at the authors' website and the R/C++ code is available at https://github.com/liqiwei2000/BaySeqPeak. Results: In simulation studies, the proposed method outperformed the competing methods exomePeak and MeTPeak, especially when an excess of zeros was present in the data. In real MeRIP-seq data analysis, the proposed method identified methylation sites that were more consistent with biological knowledge, and had better spatial resolution compared to the other methods. Conclusions: In this study, we develop a Bayesian hierarchical model to identify methylation peaks in MeRIP-seq data. The proposed method has a competitive edge over existing methods in terms of accuracy, robustness and spatial resolution.
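To make the zero-inflated negative binomial building block concrete, the sketch below evaluates its probability mass function. It is an illustration only and is not taken from the BaySeqPeak code: the mean/size parameterization and the names `nb_pmf`, `zinb_pmf`, `pi_zero`, `mu`, and `size` are assumptions chosen for this example.

```python
import math


def nb_pmf(y, mu, size):
    """Negative binomial pmf with mean `mu` and dispersion (size) `size`.

    The variance is mu + mu**2 / size, so a smaller `size` means
    stronger over-dispersion relative to a Poisson with the same mean.
    """
    log_p = (
        math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
        + size * math.log(size / (size + mu))
        + y * math.log(mu / (size + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, pi_zero, mu, size):
    """Zero-inflated negative binomial pmf.

    With probability `pi_zero` the count is a structural zero;
    otherwise it is drawn from the negative binomial above.
    """
    point_mass = pi_zero if y == 0 else 0.0
    return point_mass + (1.0 - pi_zero) * nb_pmf(y, mu, size)
```

Setting `pi_zero` to 0 recovers the ordinary negative binomial, which is a convenient sanity check when fitting such a model.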

7.
Array-based comparative genomic hybridization (array-CGH) is a high throughput, high resolution technique for studying the genetics of cancer. Analysis of array-CGH data typically involves estimation of the underlying chromosome copy numbers from the log fluorescence ratios and segmenting the chromosome into regions with the same copy number at each location. For the analysis of array-CGH data, we propose a new stochastic segmentation model and an associated estimation procedure that has attractive statistical and computational properties. An important benefit of this Bayesian segmentation model is that it yields explicit formulas for posterior means, which can be used to estimate the signal directly without performing segmentation. Other quantities relating to the posterior distribution that are useful for providing confidence assessments of any given segmentation can also be estimated by using our method. We propose an approximation method whose computation time is linear in sequence length, which makes our method practically applicable to the new higher density arrays. Simulation studies and applications to real array-CGH data illustrate the advantages of the proposed approach.

8.
    
Patrick LeBlanc, Li Ma. Biometrics 2023, 79(3): 2321-2332
Mixed-membership (MM) models such as latent Dirichlet allocation (LDA) have been applied to microbiome compositional data to identify latent subcommunities of microbial species. These subcommunities are informative for understanding the biological interplay of microbes and for predicting health outcomes. However, microbiome compositions typically display substantial cross-sample heterogeneities in subcommunity compositions (that is, variability in the proportions of microbes in shared subcommunities across samples), which are not accounted for in prior analyses. As a result, LDA can produce inference that is highly sensitive to the specification of the number of subcommunities and often divides a single subcommunity into multiple artificial ones. To address this limitation, we incorporate the logistic-tree normal (LTN) model into LDA to form a new MM model. This model allows cross-sample variation in the composition of each subcommunity around some “centroid” composition that defines the subcommunity. Incorporation of auxiliary Pólya-Gamma variables enables a computationally efficient collapsed blocked Gibbs sampler to carry out Bayesian inference under this model. By accounting for such heterogeneity, our new model restores the robustness of the inference to the specification of the number of subcommunities and allows meaningful subcommunities to be identified.

9.
    
Tieming Ji. Biometrics 2019, 75(2): 663-673

10.
Type-II ryanodine receptor channels (RYRs) play a fundamental role in intracellular Ca²⁺ dynamics in the heart. The processes of activation, inactivation, and regulation of these channels have been the subject of intensive research and the focus of recent debates. Typically, approaches to understand these processes involve statistical analysis of single RYRs, involving signal restoration, model estimation, and selection. These tasks are usually performed by following rather phenomenological criteria that turn models into self-fulfilling prophecies. Here, a thorough statistical treatment is applied by modeling single RYRs using aggregated hidden Markov models. Inferences are made using Bayesian statistics and stochastic search methods known as Markov chain Monte Carlo. These methods allow extension of the temporal resolution of the analysis far beyond the limits of previous approaches and provide a direct measure of the uncertainties associated with every estimation step, together with a direct assessment of why and where a particular model fails. Analyses of single RYRs at several Ca²⁺ concentrations are made by considering 16 models, some of them previously reported in the literature. Results clearly show that single RYRs have Ca²⁺-dependent gating modes. Moreover, our results demonstrate that single RYRs responding to a sudden change in Ca²⁺ display adaptation kinetics. Interestingly, the best-ranked models predict microscopic reversibility when monovalent cations are used as the main permeating species. Finally, the extended bandwidth revealed the existence of a novel fast buzz mode at low Ca²⁺ concentrations.

11.
Total citations: 1, self-citations: 0, citations by others: 1
Hodges JS, Carlin BP, Fan Q. Biometrics 2003, 59(2): 317-322
Bayesian analyses of spatial data often use a conditionally autoregressive (CAR) prior, which can be written as the kernel of an improper density that depends on a precision parameter τ that is typically unknown. To include τ in the Bayesian analysis, the kernel must be multiplied by τ^k for some k. This article rigorously derives k = (n - I)/2 for the L2 norm CAR prior (also called a Gaussian Markov random field model) and k = n - I for the L1 norm CAR prior, where n is the number of regions and I the number of "islands" (disconnected groups of regions) in the spatial map. Since I = 1 for a spatial structure defining a connected graph, this supports Knorr-Held's (2002, in Highly Structured Stochastic Systems, 260-264) suggestion that k = (n - 1)/2 in the L2 norm case, instead of the more common k = n/2. We illustrate the practical significance of our results using a periodontal example.
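The exponent k stated in this abstract depends only on the number of regions n and the number of islands I, so it can be computed directly from a map's adjacency list. The sketch below does this by counting connected components with a small union-find. The function names `count_islands` and `precision_exponent` and the edge-list input format are assumptions made for illustration; they are not part of the article.

```python
def count_islands(n, edges):
    """Count disconnected groups ("islands") among regions 0..n-1.

    edges is an iterable of (a, b) pairs of neighboring regions.
    """
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
    return len({find(i) for i in range(n)})


def precision_exponent(n, edges, norm="L2"):
    """Exponent k of the precision parameter, as stated in the abstract:
    k = (n - I)/2 for the L2 norm CAR prior, k = n - I for the L1 norm."""
    islands = count_islands(n, edges)
    if norm == "L2":
        return (n - islands) / 2
    if norm == "L1":
        return n - islands
    raise ValueError("norm must be 'L2' or 'L1'")
```

For a connected map the result reduces to (n - 1)/2 in the L2 case, matching the Knorr-Held suggestion mentioned in the abstract.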

12.
Total citations: 1, self-citations: 0, citations by others: 1
Thompson WK, Rosen O. Biometrics 2008, 64(1): 54-63
Summary. We propose a method for analyzing data that consist of curves for multiple individuals, i.e., longitudinal or functional data. We use a Bayesian model where curves are expressed as linear combinations of B-splines with random coefficients. The curves are estimated as posterior means obtained via Markov chain Monte Carlo (MCMC) methods, which automatically select the local level of smoothing. The method is applicable to situations where curves are sampled sparsely and/or at irregular time points. We construct posterior credible intervals for the mean curve and for the individual curves. This methodology provides unified, efficient, and flexible means for smoothing functional data.

13.
    
Daniel R. Kowal. Biometrics 2019, 75(4): 1321-1333
Measles presents a unique and imminent challenge for epidemiologists and public health officials: the disease is highly contagious, yet vaccination rates are declining precipitously in many localities. Consequently, the risk of a measles outbreak continues to rise. To improve preparedness, we study historical measles data both prevaccine and postvaccine, and design new methodology to forecast measles counts with uncertainty quantification. We propose to model the disease counts as an integer-valued functional time series: measles counts are a function of time-of-year and time-ordered by year. The counts are modeled using a negative-binomial distribution conditional on a real-valued latent process, which accounts for the overdispersion observed in the data. The latent process is decomposed using an unknown basis expansion, which is learned from the data, with dynamic basis coefficients. The resulting framework provides enhanced capability to model complex seasonality, which varies dynamically from year-to-year, and offers improved multimonth-ahead point forecasts and substantially tighter forecast intervals (with correct coverage) compared to existing forecasting models. Importantly, the fully Bayesian approach provides well-calibrated and precise uncertainty quantification for epi-relevant features, such as the future value and time of the peak measles count in a given year. An R package is available online.

14.
    

15.
    
The main goal of this paper is to investigate a cure rate model that encompasses some well-known proposals found in the literature. In our work the number of competing causes of the event of interest follows the negative binomial distribution. The model is conveniently reparametrized through the cured fraction, which is then linked to covariates by means of the logistic link. We explore the use of Markov chain Monte Carlo methods to develop a Bayesian analysis in the proposed model. The procedure is illustrated with a numerical example.

16.
    
Summary. Many hormones are secreted in pulses. The pulsatile relationship between hormones regulates many biological processes. To understand endocrine system regulation, time series of hormone concentrations are collected. The goal is to characterize pulsatile patterns and associations between hormones. Currently each hormone on each subject is fitted univariately. This leads to estimates of the number of pulses and estimates of the amount of hormone secreted; however, when the signal-to-noise ratio is small, pulse detection and parameter estimation remain difficult with existing approaches. In this article, we present a bivariate deconvolution model of pulsatile hormone data focusing on incorporating pulsatile associations. Through simulation, we show that using the underlying pulsatile association between two hormones improves the estimation of the number of pulses and the other parameters defining each hormone. We develop the one-to-one, driver–response case and show how birth–death Markov chain Monte Carlo can be used for estimation. We demonstrate these features through a simulation study and apply the method to luteinizing and follicle stimulating hormones.

17.
18.
Optimizing the effect of management practices on weed population dynamics is challenging due to the difficulties in inferring demographic parameters in seed banks and their response to disturbance. Here, we used a long-term plant survey between 2006 and 2012 in 46 French vineyards and quantified the effects of management practices (tillage, mowing, and herbicide) on colonization, germination, and seed survival of 30 weed species in relation to their seed mass. To do so, we used a recent statistical approach to reliably estimate demographic parameters for plant populations with a seed bank using time series of presence–absence data, which we extended to account for interspecies variation in the effects of management practices on demographic parameters. Our main finding was that when the level of disturbance increased (i.e., in plots with a higher number of herbicide, tillage, or mowing treatments), colonization success and survival in large-seeded species increased faster than in small-seeded species. High disturbance through tillage increased survival in the seed bank of species with high seed mass. The application of herbicides increased germination, survival, and colonization probabilities of species with high seed mass. Mowing, representing habitats more competitive for light, increased the survival of species with high seed mass. Overall, the strong relationships between the effects of management practices and seed mass provide an indicator for predicting the dynamics of weed communities under disturbance.

19.
A common problem in molecular phylogenetics is choosing a model of DNA substitution that does a good job of explaining the DNA sequence alignment without introducing superfluous parameters. A number of methods have been used to choose among a small set of candidate substitution models, such as the likelihood ratio test, the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and Bayes factors. Current implementations of any of these criteria suffer from the limitation that only a small set of models is examined, or that the test does not allow easy comparison of non-nested models. In this article, we expand the pool of candidate substitution models to include all possible time-reversible models. This set includes seven models that have already been described. We show how Bayes factors can be calculated for these models using reversible jump Markov chain Monte Carlo, and apply the method to 16 DNA sequence alignments. For each data set, we compare the model with the best Bayes factor to the best models chosen using AIC and BIC. We find that the best model under any of these criteria is not necessarily the most complicated one; models with an intermediate number of substitution types typically do best. Moreover, almost all of the models that are chosen as best do not constrain a transition rate to be the same as a transversion rate, suggesting that it is the transition/transversion rate bias that plays the largest role in determining which models are selected. Importantly, the reversible jump Markov chain Monte Carlo algorithm described here allows estimation of phylogeny (and other phylogenetic model parameters) to be performed while accounting for uncertainty in the model of DNA substitution.
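For readers unfamiliar with the two information criteria named in this abstract, the sketch below computes them from a maximized log-likelihood using the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L. It covers only AIC and BIC, not the Bayes factor or reversible jump calculation the article describes. The helper names, the candidate labels, and every numeric value in the usage note are hypothetical and chosen purely for illustration.

```python
import math


def aic(loglik, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * loglik


def bic(loglik, n_params, n_obs):
    """Bayesian Information Criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * loglik


def select_best(candidates, n_obs):
    """Return (best model by AIC, best model by BIC).

    candidates maps a model label to (maximized log-likelihood,
    number of free parameters).
    """
    best_aic = min(candidates, key=lambda m: aic(*candidates[m]))
    best_bic = min(candidates, key=lambda m: bic(*candidates[m], n_obs))
    return best_aic, best_bic
```

As a usage example with made-up numbers, `select_best({"simple": (-1000.0, 1), "complex": (-995.0, 4)}, n_obs=1000)` picks the more heavily parameterized model under AIC and the simpler one under BIC, because BIC's penalty grows with the number of observations (here, alignment sites).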

20.
    
Sun L, Clayton MK. Biometrics 2008, 64(1): 74-84
Summary. We address the development of methods for analyzing cross-classified categorical data that are spatially autocorrelated. We first extend the autologistic model to accommodate two variables. Two bivariate autologistic models are constructed, namely a two-step model and a symmetric model. Importance sampling is used to approximate the complex normalizing factors that arise in these models, and Markov chain Monte Carlo techniques are used to generate simulations of posterior distributions. The resulting models are then expanded to accommodate trend surfaces and directional effects. Simulation studies and real data are used to illustrate this method.
