Similar articles
 20 similar articles found (search time: 15 ms)
1.
Continuous-time, multistate processes can be used to represent a variety of biological processes in the public health sciences; yet the analysis of such processes is complex when they are observed only at a limited number of time points. Inference methods for such panel data have been developed for time-homogeneous Markov models, but there has been little research done for other classes of processes. We develop likelihood-based methods for panel data from a semi-Markov process, where transition intensities depend on the duration of time in the current state. The proposed methods account for possible misclassification of states. To illustrate the methods, we investigate a three-state and a four-state model in detail and apply the results to model the natural history of oncogenic genital human papillomavirus infections in women.
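
As a point of reference for the panel-data setting described above, the following is a minimal sketch of the likelihood for the simpler time-homogeneous Markov case, not the semi-Markov method of the paper. It assumes states are observed without misclassification, and the intensity matrix `q`, the observation times, and the observed states are all hypothetical values chosen for illustration. The transition probabilities over an interval of length t are the entries of exp(Qt), computed here with a small pure-Python scaling-and-squaring routine.

```python
import math


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(q, t, squarings=10, terms=20):
    """Approximate exp(q * t) with a truncated Taylor series plus repeated squaring."""
    n = len(q)
    scale = t / (2 ** squarings)
    a = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term now holds a**k / k!
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def panel_log_likelihood(q, times, states):
    """Log-likelihood of one subject's panel observations under a
    time-homogeneous Markov model with intensity matrix q.
    Raises ValueError if an observed transition has probability zero."""
    ll = 0.0
    for k in range(len(times) - 1):
        p = mat_exp(q, times[k + 1] - times[k])
        ll += math.log(p[states[k]][states[k + 1]])
    return ll


# Hypothetical three-state model (state 2 is absorbing); values are illustrative only.
q = [[-0.3, 0.2, 0.1],
     [0.1, -0.4, 0.3],
     [0.0, 0.0, 0.0]]
times = [0.0, 1.0, 2.5, 4.0]   # visit times
states = [0, 0, 1, 2]          # state recorded at each visit
ll = panel_log_likelihood(q, times, states)
```

Because only the state at each visit enters the likelihood, the exact transition times between visits never need to be known; that is what makes the homogeneous case tractable and the duration-dependent case harder.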

2.
In longitudinal studies where time to a final event is the ultimate outcome, information is often available about intermediate events that individuals may experience during the observation period. Even though many extensions of the Cox proportional hazards model have been proposed to model such multivariate time-to-event data, these approaches are still very rarely applied to real datasets. The aim of this paper is to illustrate the application of extended Cox models for multiple time-to-event data and to show their implementation in popular statistical software packages. We demonstrate a systematic way of jointly modelling similar or repeated transitions in follow-up data by analysing an event-history dataset consisting of 270 breast cancer patients who were followed up for different clinical events during treatment in metastatic disease. First, we show how this methodology can also be applied to non-Markovian stochastic processes by representing these processes as "conditional" Markov processes. Secondly, we compare the application of different Cox-related approaches to the breast cancer data by varying their key model components (i.e. analysis time scale, risk set and baseline hazard function). Our study showed that extended Cox models are a powerful tool for analysing complex event history datasets, since the approach can address many dynamic data features such as multiple time scales, dynamic risk sets, time-varying covariates, transition-by-covariate interactions, autoregressive dependence or intra-subject correlation.

3.
MOTIVATION: Many biomedical and clinical research problems involve discovering causal relationships between observations gathered from temporal events. Dynamic Bayesian networks are a powerful modeling approach to describe causal or apparently causal relationships, and support complex medical inference, such as future response prediction, automated learning, and rational decision making. Although many engines exist for creating Bayesian networks, most require a local installation and significant data manipulation to be practical for a general biologist or clinician. No software pipeline currently exists for interpretation and inference of dynamic Bayesian networks learned from biomedical and clinical data. RESULTS: miniTUBA is a web-based modeling system that allows clinical and biomedical researchers to perform complex medical/clinical inference and prediction using dynamic Bayesian network analysis with temporal datasets. The software allows users to choose different analysis parameters (e.g. Markov lags and prior topology), and continuously update their data and refine their results. miniTUBA can make temporal predictions to suggest interventions based on an automated learning process pipeline using all data provided. Preliminary tests using synthetic data and laboratory research data indicate that miniTUBA accurately identifies regulatory network structures from temporal data. AVAILABILITY: miniTUBA is available at http://www.minituba.org.

4.
ABSTRACT: BACKGROUND: A number of software packages are available to generate DNA multiple sequence alignments (MSAs) evolved under continuous-time Markov processes on phylogenetic trees. On the other hand, methods of simulating the DNA MSA directly from the transition matrices do not exist. Moreover, existing software is restricted to time-reversible models and is not optimized to generate nonhomogeneous data (i.e. placing distinct substitution rates on different lineages). RESULTS: We present the first package designed to generate MSAs evolving under discrete-time Markov processes on phylogenetic trees, directly from probability substitution matrices. Based on the input model and a phylogenetic tree in the Newick format (with branch lengths measured as the expected number of substitutions per site), the algorithm produces DNA alignments of desired length. GenNon-h is publicly available for download. CONCLUSION: The software presented here is an efficient tool to generate DNA MSAs on a given phylogenetic tree. GenNon-h provides the user with the nonstationary or nonhomogeneous phylogenetic data that is well suited for testing complex biological hypotheses, exploring the limits of the reconstruction algorithms and their robustness to such models.
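
The idea of generating an alignment directly from per-edge substitution matrices can be sketched as follows. This is an illustrative toy, not GenNon-h's implementation or interface: the nested-tuple tree representation (used here instead of Newick parsing), the function names, and every matrix entry are assumptions made for the example. Each site is drawn at the root and then pushed down each edge through that edge's own transition matrix, so different lineages can follow different processes.

```python
import random

NUCLEOTIDES = "ACGT"


def sample_index(probs, rng):
    """Draw an index from a discrete distribution given as a list of probabilities."""
    u = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if u < acc:
            return i
    return len(probs) - 1  # guard against floating-point round-off


def evolve(parent_seq, matrix, rng):
    """Apply one edge's substitution matrix independently to every site."""
    return [sample_index(matrix[x], rng) for x in parent_seq]


def simulate_alignment(tree, root_dist, length, seed=0):
    """tree is (name, children); children is a list of (matrix, subtree) pairs.
    Returns a dict mapping leaf names to simulated DNA strings."""
    rng = random.Random(seed)
    root_seq = [sample_index(root_dist, rng) for _ in range(length)]
    alignment = {}

    def walk(node, seq):
        name, children = node
        if not children:
            alignment[name] = "".join(NUCLEOTIDES[x] for x in seq)
        for matrix, child in children:
            walk(child, evolve(seq, matrix, rng))

    walk(tree, root_seq)
    return alignment


def uniform_change(p):
    """Hypothetical matrix: each of the three possible changes has probability p."""
    return [[1 - 3 * p if i == j else p for j in range(4)] for i in range(4)]


# Hypothetical non-symmetric matrix that favours changes towards C and G.
gc_biased = [
    [0.85, 0.05, 0.08, 0.02],
    [0.02, 0.92, 0.04, 0.02],
    [0.02, 0.04, 0.92, 0.02],
    [0.02, 0.05, 0.08, 0.85],
]

# Different matrices on different edges give nonhomogeneous data.
tree = ("root", [
    (uniform_change(0.02), ("A", [])),
    (gc_biased, ("inner", [
        (uniform_change(0.05), ("B", [])),
        (gc_biased, ("C", [])),
    ])),
])
alignment = simulate_alignment(tree, [0.25, 0.25, 0.25, 0.25], length=30)
```

Working with matrices rather than rate parameters means no reversibility or stationarity constraint is imposed: any row-stochastic 4×4 matrix is a valid edge model in this sketch.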

5.
Semi-Markov and modulated renewal processes provide a large class of multi-state models which can be used for analysis of longitudinal failure time data. In biomedical applications, models of this kind are often used to describe the evolution of a disease and assume that a patient may move among a finite number of states representing different phases in the disease progression. Several authors proposed extensions of the proportional hazards model for regression analysis of these processes. In this paper, we consider a general class of censored semi-Markov and modulated renewal processes and propose use of transformation models for their analysis. Special cases include modulated renewal processes with interarrival times specified using transformation models, and semi-Markov processes with one-step transition probabilities defined using copula-transformation models. We discuss estimation of finite and infinite dimensional parameters and develop an extension of the Gaussian multiplier method for setting confidence bands for transition probabilities and related parameters. A transplant outcome data set from the Center for International Blood and Marrow Transplant Research is used for illustrative purposes.

6.
Hidden Markov models have been successfully applied in various fields of time series analysis, especially for analyzing ion channel recordings. The maximum likelihood estimator (MLE) has recently been proven to be asymptotically normally distributed. Here, we investigate finite sample properties of the MLE and of different types of likelihood ratio tests (LRTs) by means of simulation studies. The MLE is shown to reach the asymptotic behavior within sample sizes that are common for various applications. Thus, reliable estimates and confidence intervals can be obtained. We give an approximate scaling function for the estimation error for finite samples, and investigate the power of different LRTs suitable for applications to ion channels, including tests for superimposed hidden Markov processes. Our results are applied to physiological sodium channel data.

7.
The development of mobile-health technology has the potential to revolutionize personalized medicine. Biomedical sensors (e.g., wearables) can assist with determining treatment plans for individuals, provide quantitative information to healthcare providers, and give objective measurements of health, leading to the goal of precise phenotypic correlates for genotypes. Even though treatments and interventions are becoming more specific and datasets more abundant, measuring the causal impact of health interventions requires careful consideration of complex covariate structures, as well as knowledge of the temporal and spatial properties of the data. Thus, interpreting biomedical sensor data needs to make use of specialized statistical models. Here, we show how the Bayesian structural time series framework, widely used in economics, can be applied to these data. This framework corrects for covariates to provide accurate assessments of the significance of interventions. Furthermore, it allows for a time-dependent confidence interval of impact, which is useful for considering individualized assessments of intervention efficacy. We provide a customized biomedical adaptor tool, MhealthCI, around a specific implementation of the Bayesian structural time series framework that uniformly processes, prepares, and registers diverse biomedical data. We apply the software implementation of MhealthCI to a structured set of examples in biomedicine to showcase the ability of the framework to evaluate interventions with varying levels of data richness and covariate complexity and also compare the performance to other models. Specifically, we show how the framework is able to evaluate an exercise intervention’s effect on stabilizing blood glucose in a diabetes dataset. We also provide a future-anticipating illustration from a behavioral dataset showcasing how the framework integrates complex spatial covariates.
Overall, we show the robustness of the Bayesian structural time series framework when applied to biomedical sensor data, highlighting its increasing value for current and future datasets.

8.
MSMBuilder is a software package for building statistical models of high-dimensional time-series data. It is designed with a particular focus on the analysis of atomistic simulations of biomolecular dynamics such as protein folding and conformational change. MSMBuilder is named for its ability to construct Markov state models (MSMs), a class of models that has gained favor among computational biophysicists. In addition to both well-established and newer MSM methods, the package includes complementary algorithms for understanding time-series data such as hidden Markov models and time-structure based independent component analysis. MSMBuilder boasts an easy-to-use command-line interface, as well as clear and consistent abstractions through its Python application programming interface. MSMBuilder was developed with careful consideration for compatibility with the broader machine learning community by following the design of scikit-learn. The package is used primarily by practitioners of molecular dynamics, but is just as applicable to other computational or experimental time-series measurements.

9.
Biomedical trials often give rise to data having the form of time series of a common process on separate individuals. One model which has been proposed to explain variations in such series across individuals is a random effects model based on sample periodograms. The use of spectral coefficients enables models for individual series to be constructed on the basis of standard asymptotic theory, whilst variations between individuals are handled by permitting a random effect perturbation of model coefficients. This paper extends such methodology in two ways: first, by enabling a nonparametric specification of underlying spectral behaviour; second, by addressing some of the tricky computational issues which are encountered when working with this class of random effect models. This leads to a model in which a population spectrum is specified nonparametrically through a dynamic system, and the processes measured on individuals within the population are assumed to have a spectrum which has a random effect perturbation from the population norm. Simulation studies show that standard MCMC algorithms give effective inferences for this model, and applications to biomedical data suggest that the model itself is capable of revealing scientifically important structure in temporal characteristics both within and between individual processes.

10.
Hypothesis generation in observational, biomedical data science often starts with computing an association or identifying the statistical relationship between a dependent and an independent variable. However, the outcome of this process depends fundamentally on modeling strategy, with differing strategies generating what can be called “vibration of effects” (VoE). VoE is defined by variation in associations that often lead to contradictory results. Here, we present a computational tool capable of modeling VoE in biomedical data by fitting millions of different models and comparing their output. We execute a VoE analysis on a series of widely reported associations (e.g., carrot intake associated with eyesight) with an extended additional focus on lifestyle exposures (e.g., physical activity) and components of the Framingham Risk Score for cardiovascular health (e.g., blood pressure). We leveraged our tool for potential confounder identification, investigating what adjusting variables are responsible for conflicting models. We propose modeling VoE as a critical step in navigating discovery in observational data, discerning robust associations, and cataloging adjusting variables that impact model output.

COVID positivity and vitamin D intake, red meat and heart disease: how can we discern when biomedical associations are reliable and when they are susceptible to our own arbitrary choices and assumptions? This study presents “quantvoe,” a software package for exploring the entirety of possible findings due to the multiverse of associations possible.

11.
Cook RJ, Zeng L, Lee KA. Biometrics 2008, 64(4):1100-1109
SUMMARY: Interval-censored life-history data arise when the events of interest are only detectable at periodic assessments. When interest lies in the occurrence of two such events, bivariate interval-censored event time data are obtained. We describe how to fit a four-state Markov model useful for characterizing the association between two interval-censored event times when the assessment times for the two events may be generated by different inspection processes. The approach treats the two events symmetrically and enables one to fit multiplicative intensity models that give estimates of covariate effects as well as relative risks characterizing the association between the two events. An expectation-maximization (EM) algorithm is described for estimation in which the maximization step can be carried out with standard software. The method is illustrated by application to data from a trial of HIV patients where the events are the onset of viral shedding in the blood and urine among individuals infected with cytomegalovirus.

12.

Background

Estimating the reduction in levels of infection during implementation of soil-transmitted helminth (STH) control programmes is important to measure their performance and to plan interventions. Markov modelling techniques have been used with some success to predict changes in STH prevalence following treatment in Viet Nam. The model is stationary and, to date, the prediction has been obtained by calculating the transition probabilities between the different classes of intensity following the first year of drug distribution and assuming that these remain constant in subsequent years. However, to run this model, longitudinal parasitological data (including intensity of infection) are required for two consecutive years from at least 200 individuals. Since this amount of data is not often available from STH control programmes, the possible application of the model in control programmes is limited. The present study aimed to address this issue by adapting the existing Markov model to allow its application when a more limited amount of data is available and to test the predictive capacities of these simplified models.

Method

We analysed data from field studies conducted with different combinations of three parameters: (i) the frequency of drug administration; (ii) the drug distributed; and (iii) the target treatment population (entire population or school-aged children only). This analysis allowed us to define 10 sets of standard transition probabilities to be used to predict prevalence changes when only baseline data are available (simplified model 1, SM1). We also formulated three equations (one for each STH parasite) to calculate the predicted prevalence of the different classes of intensity from the total prevalence. These equations allowed us to design a second simplified model (SM2) to obtain predictions when the classes of intensity at baseline were not known. To evaluate the performance of the simplified models, we collected data from the scientific literature on changes in STH prevalence during the implementation of 26 control programmes in 16 countries. Using the baseline data observed, we applied the simplified models and predicted the onward prevalence of STH infection at each time-point for which programme data were available. We then compared the output from the model with the observed data from the programme.

Results

The comparison between the model-predicted prevalence and the observed values demonstrated good accuracy of the predictions. In 77% of cases the original model predicted a prevalence within five absolute percentage points of the observed figure; simplified model 1 did so in 69% of cases and simplified model 2 in 60% of cases. We consider that the STH Markov model described here could be an important tool for programme managers to monitor the progress of their control programmes and to select the appropriate intervention. We also developed, and made freely available online, a software tool to enable the use of the STH Markov model by personnel with limited knowledge of mathematical models.
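
The stationary prediction step described in the Background can be sketched in a few lines. This is only an illustration of the general mechanism (a fixed transition matrix applied once per treatment round), not the authors' software tool: the four intensity classes, the baseline distribution, the transition probabilities, and the "observed" figures below are all hypothetical numbers invented for the example.

```python
def step(distribution, transition):
    """Advance the class distribution by one round: new[j] = sum_i old[i] * T[i][j]."""
    n = len(distribution)
    return [sum(distribution[i] * transition[i][j] for i in range(n))
            for j in range(n)]


def predict_prevalence(baseline, transition, years):
    """Return total prevalence (1 - uninfected share) at baseline and after each year,
    assuming the same transition matrix applies every year (stationarity)."""
    dist = list(baseline)
    prevalence = [1.0 - dist[0]]
    for _ in range(years):
        dist = step(dist, transition)
        prevalence.append(1.0 - dist[0])
    return prevalence


def within_tolerance(predicted, observed, tol=0.05):
    """True where prediction and observation differ by at most tol (absolute)."""
    return [abs(p - o) <= tol for p, o in zip(predicted, observed)]


# Hypothetical classes: 0 = uninfected, 1 = light, 2 = moderate, 3 = heavy.
baseline = [0.40, 0.35, 0.15, 0.10]
# Hypothetical yearly transition probabilities under treatment (rows sum to 1).
transition = [
    [0.85, 0.12, 0.02, 0.01],
    [0.55, 0.38, 0.05, 0.02],
    [0.40, 0.40, 0.15, 0.05],
    [0.30, 0.40, 0.20, 0.10],
]
predicted = predict_prevalence(baseline, transition, years=3)
# Hypothetical survey results, used only to show how the accuracy check works.
observed = [0.60, 0.41, 0.30, 0.26]
hits = within_tolerance(predicted, observed)
```

The `within_tolerance` helper mirrors the accuracy criterion quoted in the Results: a prediction counts as a hit when it falls within five absolute percentage points of the observed prevalence.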

13.
Time-series data resulting from surveying wild animals are often described using state-space population dynamics models, in particular with Gompertz, Beverton-Holt, or Moran-Ricker latent processes. We show how hidden Markov model methodology provides a flexible framework for fitting a wide range of models to such data. This general approach makes it possible to model abundance on the natural or log scale, include multiple observations at each sampling occasion and compare alternative models using information criteria. It also easily accommodates unequal sampling time intervals, should that possibility occur, and allows testing for density dependence using the bootstrap. The paper is illustrated by replicated time series of red kangaroo abundances, and a univariate time series of ibex counts which are an order of magnitude larger. In the analyses carried out, we fit different latent process and observation models using the hidden Markov framework. Results are robust with regard to the necessary discretization of the state variable. We find no effective difference between the three latent models of the paper in terms of maximized likelihood value for the two applications presented, and also others analyzed. Simulations suggest that ecological time series are not sufficiently informative to distinguish between alternative latent processes for modeling population survey data when data do not indicate strong density dependence.

14.
Nonequilibrium response spectroscopy (NRS) has been proposed recently to complement standard electrophysiological techniques used to investigate ion channels. It involves application of rapidly oscillating potentials that drive the ion channel ensemble far from equilibrium. It is argued that new, so far undiscovered features of ion channel gating kinetics may become apparent under such nonequilibrium conditions. In this paper we explore the possibility of using regular, sinusoidal voltages with the NRS protocols to facilitate Markov model selection for ion channels. As a test case we consider the Shaker potassium channel for which various Markov models have been proposed recently. We concentrate on certain classes of such models and show that while some models might be virtually indistinguishable using standard methods, they show marked differences when driven with an oscillating voltage. Model currents are compared to experimental data obtained for the Shaker K+ channel expressed in mammalian cells (tsA 201).

15.
Özgür Sahin. FEBS Letters 2009, 583(11):1766-1771
Substantial progress in functional genomic and proteomic technologies has opened new perspectives in biomedical research. The sequence of the human genome has been mostly determined, offering new insights into its complexity and regulation. New technologies, like RNAi and protein arrays, allow gathering knowledge beyond single gene analysis. Increasingly, biological processes are studied with systems biological approaches, where qualitative and quantitative data of the components are utilized to model the respective processes, to predict effects of perturbations, and to then refine these models after experimental testing. Here, we describe the potential of applying functional genomics and proteomics, taking the ERBB family of growth-factor receptors as an example to study the signaling network and its impact on cancer.

16.
MOTIVATION: Bayesian analysis is one of the most popular methods in phylogenetic inference. The most commonly used methods fix a single multiple alignment and consider only substitutions as phylogenetically informative mutations, though alignments and phylogenies should be inferred jointly as insertions and deletions also carry informative signals. Methods addressing these issues have been developed only recently, and so far there has been no user-friendly program with a graphical interface that implements these methods. RESULTS: We have developed an extendable software package in the Java programming language that samples from the joint posterior distribution of phylogenies, alignments and evolutionary parameters by applying the Markov chain Monte Carlo method. The package also offers tools for efficient on-the-fly summarization of the results. It has a graphical interface to configure, start and supervise the analysis, to track the status of the Markov chain and to save the results. The background model for insertions and deletions can be combined with any substitution model. It is easy to add new substitution models to the software package as plugins. The samples from the Markov chain can be summarized in several ways, and new postprocessing plugins may also be installed.

17.
The most commonly used models for analysing local dependencies in DNA sequences are (high-order) Markov chains. Incorporating knowledge about possible groupings of the nucleotides makes it possible to define dedicated sub-classes of Markov chains. The problem of formulating lumpability hypotheses for a Markov chain is therefore addressed. In the classical approach to lumpability, this problem can be formulated as the determination of an appropriate state space (smaller than the original state space) such that the lumped chain defined on this state space retains the Markov property. We propose a different perspective on lumpability where the state space is fixed and the partitioning of this state space is represented by a one-to-many probabilistic function within a two-level stochastic process. Three nested classes of lumped processes can be defined in this way as sub-classes of first-order Markov chains. These lumped processes enable parsimonious reparameterizations of Markov chains that help to reveal relevant partitions of the state space. Characterizations of the lumped processes on the original transition probability matrix are derived. Different model selection methods relying either on hypothesis testing or on penalized log-likelihood criteria are presented as well as extensions to lumped processes constructed from high-order Markov chains. The relevance of the proposed approach to lumpability is illustrated by the analysis of DNA sequences. In particular, the use of lumped processes makes it possible to highlight differences between intronic sequences and gene untranslated region sequences.
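
For orientation, the classical notion of lumpability mentioned above can be checked mechanically; the sketch below does that and nothing more. It does not implement the two-level lumped processes the paper proposes. It uses the standard strong-lumpability condition (every state in a block must have the same total probability of moving into each block), and the purine/pyrimidine partition and all matrix entries are hypothetical values chosen for illustration.

```python
def is_strongly_lumpable(p, partition, tol=1e-9):
    """Check the classical condition: for every pair of blocks (block, target),
    the probability of jumping into target is the same from every state in block."""
    for block in partition:
        for target in partition:
            sums = [sum(p[i][j] for j in target) for i in block]
            if max(sums) - min(sums) > tol:
                return False
    return True


def lumped_matrix(p, partition):
    """Transition matrix of the lumped chain; only meaningful if the check passes."""
    return [[sum(p[block[0]][j] for j in target) for target in partition]
            for block in partition]


# State order: A, C, G, T. Hypothetical matrix in which A<->G and C<->T changes
# (0.10) are more likely than each of the other changes (0.05).
p = [
    [0.80, 0.05, 0.10, 0.05],
    [0.05, 0.80, 0.05, 0.10],
    [0.10, 0.05, 0.80, 0.05],
    [0.05, 0.10, 0.05, 0.80],
]
purines_pyrimidines = [[0, 2], [1, 3]]   # {A, G} and {C, T}

lumpable = is_strongly_lumpable(p, purines_pyrimidines)
two_state = lumped_matrix(p, purines_pyrimidines)

# Perturbing one row breaks the condition for the same partition.
p_perturbed = [row[:] for row in p]
p_perturbed[0] = [0.70, 0.15, 0.10, 0.05]
still_lumpable = is_strongly_lumpable(p_perturbed, purines_pyrimidines)
```

In this example the four-state chain collapses to a two-state purine/pyrimidine chain that is itself Markov, which is the "smaller state space" reading of lumpability that the abstract contrasts with its own fixed-state-space formulation.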

18.
The ability to measure the properties of proteins at the single-molecule level offers an unparalleled glimpse into biological systems at the molecular scale. The interpretation of single-molecule time series has often been rooted in statistical mechanics and the theory of Markov processes. While existing analysis methods have been useful, they are not without significant limitations including problems of model selection and parameter nonidentifiability. To address these challenges, we introduce the use of nonparametric Bayesian inference for the analysis of single-molecule time series. These methods provide a flexible way to extract structure from data instead of assuming models beforehand. We demonstrate these methods with applications to several diverse settings in single-molecule biophysics. This approach provides a well-constrained and rigorously grounded method for determining the number of biophysical states underlying single-molecule data.

19.
Identifying risk factors for transition rates among normal cognition, mild cognitive impairment, dementia and death in an Alzheimer's disease study is very important. It is known that transition rates among these states are strongly time dependent. While Markov process models are often used to describe these disease progressions, the literature mainly focuses on time-homogeneous processes, and limited tools are available for dealing with non-homogeneity. Further, patients may choose when they want to visit the clinics, which creates informative observations. In this paper, we develop methods to deal with non-homogeneous Markov processes through time scale transformation when observation times are pre-planned with some observations missing. Maximum likelihood estimation via the EM algorithm is derived for parameter estimation. Simulation studies demonstrate that the proposed method works well under a variety of situations. An application to the Alzheimer's disease study identifies that there is a significant increase in transition rates as a function of time. Furthermore, our models reveal that the non-ignorable missing mechanism is perhaps reasonable.

20.
Continuous-time Markov processes are often used to model the complex natural phenomenon of sequence evolution. To make the process of sequence evolution tractable, simplifying assumptions are often made about the sequence properties and the underlying process. The validity of one such assumption, time-homogeneity, has never been explored. Violations of this assumption can be found by identifying non-embeddability. A process is non-embeddable if it cannot be embedded in a continuous time-homogeneous Markov process. In this study, non-embeddability was demonstrated to exist when modelling sequence evolution with Markov models. Evidence of non-embeddability was found primarily at the third codon position, possibly resulting from changes in mutation rate over time. Outgroup edges and those with a deeper time depth were found to have an increased probability of the underlying process being non-embeddable. Overall, low levels of non-embeddability were detected when examining individual edges of triads across a diverse set of alignments. Subsequent phylogenetic reconstruction analyses demonstrated that non-embeddability could affect the correct prediction of phylogenies, but at extremely low levels. Despite the existence of non-embeddability, there is minimal evidence of violations of the local time homogeneity assumption and consequently the impact is likely to be minor.
