Similar literature
1.
Hidden Markov models (HMMs) are a class of stochastic models that have proven to be powerful tools for the analysis of molecular sequence data. A hidden Markov model can be viewed as a black box that generates sequences of observations. The unobservable internal state of the box is stochastic and is determined by a finite state Markov chain. The observable output is stochastic with distribution determined by the state of the hidden Markov chain. We present a Bayesian solution to the problem of restoring the sequence of states visited by the hidden Markov chain from a given sequence of observed outputs. Our approach is based on a Monte Carlo Markov chain algorithm that allows us to draw samples from the full posterior distribution of the hidden Markov chain paths. The problem of estimating the probability of individual paths and the associated Monte Carlo error of these estimates is addressed. The method is illustrated by considering a problem of DNA sequence multiple alignment. The special structure for the hidden Markov model used in the sequence alignment problem is considered in detail. In conclusion, we discuss certain interesting aspects of biological sequence alignments that become accessible through the Bayesian approach to HMM restoration.
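
As a rough illustration of what drawing a hidden path from its posterior can look like, the following is a minimal sketch of forward filtering followed by backward sampling for a discrete HMM with known parameters. It is not necessarily the algorithm used in the abstract above; the function name `sample_hidden_path` and the arguments `init`, `trans`, and `emit` are hypothetical, and the parameters are assumed to be given rather than estimated.

```python
import random


def sample_hidden_path(obs, init, trans, emit, rng):
    """Draw one state path from P(path | obs) for a discrete HMM.

    obs   : list of observed symbol indices
    init  : init[k] = P(first state = k)            (assumed known)
    trans : trans[i][k] = P(next = k | current = i) (assumed known)
    emit  : emit[k][y] = P(symbol y | state k)      (assumed known)
    rng   : a random.Random instance
    """
    n_states = len(init)

    # Forward pass: filtered[t][k] = P(state_t = k | obs[0..t]).
    filtered = []
    prev = None
    for t, y in enumerate(obs):
        if t == 0:
            pred = list(init)
        else:
            pred = [sum(prev[i] * trans[i][k] for i in range(n_states))
                    for k in range(n_states)]
        unnorm = [pred[k] * emit[k][y] for k in range(n_states)]
        total = sum(unnorm)
        prev = [u / total for u in unnorm]
        filtered.append(prev)

    # Backward pass: sample the last state from the final filtered
    # distribution, then each earlier state given its sampled successor.
    path = [0] * len(obs)
    path[-1] = rng.choices(range(n_states), weights=filtered[-1])[0]
    for t in range(len(obs) - 2, -1, -1):
        nxt = path[t + 1]
        weights = [filtered[t][i] * trans[i][nxt] for i in range(n_states)]
        path[t] = rng.choices(range(n_states), weights=weights)[0]
    return path
```

Under these assumptions, the posterior probability of a particular path could be estimated by the fraction of N independent draws that equal it, with a binomial standard error of sqrt(p(1 - p)/N) as one simple measure of Monte Carlo error.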

2.
Pradel R. Biometrics, 2005, 61(2): 442-447
Capture-recapture models were originally developed to account for encounter probabilities that are less than 1 in free-ranging animal populations. Nowadays, these models can deal with the movement of animals between different locations and are also used to study transitions between different states. However, their use to estimate transitions between states does not account for uncertainty in state assignment. I present the extension of multievent models, which does incorporate this uncertainty. Multievent models belong to the family of hidden Markov models. I also show in this article that the memory model, in which the next state or location is influenced by the previous state occupied, can be fully treated within the framework of multievent models.

3.
Many single-molecule experiments aim to characterize biomolecular processes in terms of kinetic models that specify the rates of transition between conformational states of the biomolecule. Estimation of these rates often requires analysis of a population of molecules, in which the conformational trajectory of each molecule is represented by a noisy, time-dependent signal trajectory. Although hidden Markov models (HMMs) may be used to infer the conformational trajectories of individual molecules, estimating a consensus kinetic model from the population of inferred conformational trajectories remains a statistically difficult task, as inferred parameters vary widely within a population. Here, we demonstrate how a recently developed empirical Bayesian method for HMMs can be extended to enable a more automated and statistically principled approach to two widely occurring tasks in the analysis of single-molecule fluorescence resonance energy transfer (smFRET) experiments: 1), the characterization of changes in rates across a series of experiments performed under variable conditions; and 2), the detection of degenerate states that exhibit the same FRET efficiency but differ in their rates of transition. We apply this newly developed methodology to two studies of the bacterial ribosome, each exemplary of one of these two analysis tasks. We conclude with a discussion of model-selection techniques for determination of the appropriate number of conformational states. The code used to perform this analysis and a basic graphical user interface front end are available as open source software.

5.
This paper discusses a two‐state hidden Markov Poisson regression (MPR) model for analyzing longitudinal data of epileptic seizure counts, which allows for the rate of the Poisson process to depend on covariates through an exponential link function and to change according to the states of a two‐state Markov chain with its transition probabilities associated with covariates through a logit link function. This paper also considers a two‐state hidden Markov negative binomial regression (MNBR) model, as an alternative, by using the negative binomial instead of Poisson distribution in the proposed MPR model when there exists extra‐Poisson variation conditional on the states of the Markov chain. The two proposed models in this paper relax the stationary requirement of the Markov chain, allow for overdispersion relative to the usual Poisson regression model and for correlation between repeated observations. The proposed methodology provides a plausible analysis for the longitudinal data of epileptic seizure counts, and the MNBR model fits the data much better than the MPR model. Maximum likelihood estimation using the EM and quasi‐Newton algorithms is discussed. A Monte Carlo study for the proposed MPR model investigates the reliability of the estimation method, the choice of probabilities for the initial states of the Markov chain, and some finite sample behaviors of the maximum likelihood estimates, suggesting that (1) the estimation method is accurate and reliable as long as the total number of observations is reasonably large, and (2) the choice of probabilities for the initial states of the Markov process has little impact on the parameter estimates.
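
The two link functions mentioned above can be sketched as follows. This is only an illustration of an exponential link for a Poisson rate and a logit link for a transition probability; the names `poisson_rate`, `switch_probability`, `beta`, `gamma`, `x`, and `z` are hypothetical, and the exact parameterization used in the paper is not given in the abstract.

```python
import math


def poisson_rate(beta, x):
    """Exponential link: state-specific rate = exp(x . beta)."""
    return math.exp(sum(b * xi for b, xi in zip(beta, x)))


def switch_probability(gamma, z):
    """Logit link: transition probability = 1 / (1 + exp(-(z . gamma)))."""
    eta = sum(g * zi for g, zi in zip(gamma, z))
    return 1.0 / (1.0 + math.exp(-eta))


def poisson_pmf(count, rate):
    """P(count | rate) for a Poisson-distributed seizure count."""
    return math.exp(-rate) * rate ** count / math.factorial(count)
```

In a two-state model of this kind, each hidden state would carry its own coefficient vector for the rate, so the same covariates can produce a low or a high expected count depending on the current state.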

6.
BioHMM: a heterogeneous hidden Markov model for segmenting array CGH data (total citations: 3; self-citations: 0; citations by others: 3)
SUMMARY: We have developed a new method (BioHMM) for segmenting array comparative genomic hybridization data into states with the same underlying copy number. By utilizing a heterogeneous hidden Markov model, BioHMM incorporates relevant biological factors (e.g. the distance between adjacent clones) in the segmentation process.

7.
Recently, there have been remarkable advances in modeling the relationships between the sensory environment, neuronal responses, and behavior. However, most models cannot encompass variable stimulus-response relationships such as varying response latencies and state or context dependence of the neural code. Here, we consider response modeling as a dynamic alignment problem and model stimulus and response jointly by a mixed pair hidden Markov model (MPH). In MPHs, multiple stimulus-response relationships (e.g., receptive fields) are represented by different states or groups of states in a Markov chain. Each stimulus-response relationship features temporal flexibility, allowing modeling of variable response latencies, including noisy ones. We derive algorithms for learning of MPH parameters and for inference of spike response probabilities. We show that some linear-nonlinear Poisson cascade (LNP) models are a special case of MPHs. We demonstrate the efficiency and usefulness of MPHs in simulations of both jittered and switching spike responses to white noise and natural stimuli. Furthermore, we apply MPHs to extracellular single and multi-unit data recorded in cortical brain areas of singing birds to showcase a novel method for estimating response lag distributions. MPHs allow simultaneous estimation of receptive fields, latency statistics, and hidden state dynamics and so can help to uncover complex stimulus response relationships that are subject to variable timing and involve diverse neural codes.

8.
McKinney SA, Joo C, Ha T. Biophysical Journal, 2006, 91(5): 1941-1951
The analysis of single-molecule fluorescence resonance energy transfer (FRET) trajectories has become a subject of significant biophysical interest. In deducing the transition rates between various states of a system for time-binned data, researchers have relied on simple, but often arbitrary methods of extracting rates from FRET trajectories. Although these methods have proven satisfactory in cases of well-separated, low-noise, two- or three-state systems, they become less reliable when applied to a system of greater complexity. We have developed an analysis scheme that casts single-molecule time-binned FRET trajectories as hidden Markov processes, allowing one to determine, based on probability alone, the most likely FRET-value distributions of states and their interconversion rates while simultaneously determining the most likely time sequence of underlying states for each trajectory. Together with a transition density plot and Bayesian information criterion we can also determine the number of different states present in a system in addition to the state-to-state transition probabilities. Here we present the algorithm and test its limitations with various simulated data and previously reported Holliday junction data. The algorithm is then applied to the analysis of the binding and dissociation of three RecA monomers on a DNA construct.
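
One common way to use a Bayesian information criterion for choosing the number of states can be sketched as below. This assumes the convention BIC = k ln(n) - 2 ln(L), where lower is better, and a Gaussian-emission HMM parameter count; both are assumptions for illustration, and the paper's own convention and parameter count may differ. The names `hmm_param_count`, `bic`, and `pick_n_states` are hypothetical.

```python
import math


def hmm_param_count(n_states):
    """Free parameters of an assumed Gaussian-emission HMM."""
    transitions = n_states * (n_states - 1)  # each row sums to 1
    initial = n_states - 1                   # initial distribution sums to 1
    emissions = 2 * n_states                 # one mean and one variance per state
    return transitions + initial + emissions


def bic(log_likelihood, n_params, n_obs):
    """BIC = k ln(n) - 2 ln(L); lower values indicate a preferred model."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def pick_n_states(log_likelihoods, n_obs):
    """log_likelihoods maps a candidate state count to its maximized log-likelihood."""
    return min(
        log_likelihoods,
        key=lambda k: bic(log_likelihoods[k], hmm_param_count(k), n_obs),
    )
```

With 500 observations, for example, a third state that raises the log-likelihood by only one unit would be rejected under this convention, while a gain of 100 units would be accepted.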

9.
A few models have appeared in recent years that consider not only the way substitutions occur through evolutionary history at each site of a genome, but also the way the process changes from one site to the next. These models combine phylogenetic models of molecular evolution, which apply to individual sites, and hidden Markov models, which allow for changes from site to site. Besides improving the realism of ordinary phylogenetic models, they are potentially very powerful tools for inference and prediction--for example, for gene finding or prediction of secondary structure. In this paper, we review progress on combined phylogenetic and hidden Markov models and present some extensions to previous work. Our main result is a simple and efficient method for accommodating higher-order states in the HMM, which allows for context-dependent models of substitution--that is, models that consider the effects of neighboring bases on the pattern of substitution. We present experimental results indicating that higher-order states, autocorrelated rates, and multiple functional categories all lead to significant improvements in the fit of a combined phylogenetic and hidden Markov model, with the effect of higher-order states being particularly pronounced.

10.
Hidden Markov modelling is a powerful and efficient digital signal processing strategy for extracting the maximum likelihood model from a finite length sample of noisy data. Assuming the number of states in the model is known, then the state levels, transition probabilities, initial state distribution and the noise variance can be estimated. We investigate the applicability of this technique in membrane channel kinetics not only as a parameter estimator, but also as an aid to discriminating between various model types according to their statistical likelihood. We survey three representative classes of channel dynamics, namely: aggregated Markov models, semi-Markov models (with asymptotically convergent transition probabilities), and coupled Markov models; reformulating each within a discrete-time hidden Markov model framework. We then provide numerical evidence of the effectiveness of the procedure using simulated channel data and hence show that the correct model, as well as the model parameters, can be discerned. We also demonstrate that the model likelihood can be used to indicate the approximate number of states in the model.

11.
12.
Usually in capture–recapture, a model parameter is time or time since first capture dependent. However, the case where the probability of staying in one state depends on the time spent in that particular state is not rare. Hidden Markov models are not appropriate to manage these situations. A more convenient approach would be to consider models that incorporate semi‐Markovian states which explicitly define the waiting time distribution and have been used in previous biologic studies as a convenient framework for modeling the time spent in a given physiological state. Here, we propose hidden Markovian models that combine several nonhomogeneous Markovian states with one semi‐Markovian state and which (i) are well adapted to imperfect and variable detection and (ii) allow us to consider time, time since first capture, and time spent in one state effects. Implementation details depending on the number of semi‐Markovian states are discussed. From a user's perspective, the present approach enhances the toolbox for analyzing capture–recapture data. We then show the potential of this framework by means of two ecological examples: (i) stopover duration and (ii) breeding success dynamics.

13.
We investigate models for animal feeding behaviour, with the aim of improving understanding of how animals organise their behaviour in the short term. We consider three classes of model: hidden Markov, latent Gaussian and semi-Markov. Each can predict the typical 'clustered' feeding behaviour that is generally observed; however, they differ in the extent to which 'memory' of previous behaviour is allowed to affect future behaviour. The hidden Markov model has 'lack of memory', the current behavioural state being dependent on the previous state only. The latent Gaussian model assumes feeding/non-feeding periods to occur by the thresholding of an underlying continuous variable, thereby incorporating some 'short-term memory'. The semi-Markov model, by taking into account the duration of time spent in the previous state, can be said to incorporate 'longer-term memory'. We fit each of these models to a dataset of cow feeding behaviour. We find the semi-Markov model (longer-term memory) to have the best fit to the data and the hidden Markov model (lack of memory) the worst. We argue that in view of effects of satiety on short-term feeding behaviour of animal species in general, biologically suitable models should allow 'memory' to play a role. We conclude that our findings are equally relevant for the analysis of other types of short-term behaviour that are governed by satiety-like principles.
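
The 'lack of memory' of a discrete-time hidden Markov model can be made concrete with a small sketch: if a state has self-transition probability a, its dwell time is geometric, so the chance of leaving is the same however long the state has already lasted. The function names `hmm_dwell_pmf` and `leave_hazard` and the truncation at `max_duration` are assumptions made for this illustration, not part of the study above.

```python
def hmm_dwell_pmf(stay_prob, duration):
    """P(dwell time = duration steps) for self-transition probability stay_prob."""
    return stay_prob ** (duration - 1) * (1.0 - stay_prob)


def leave_hazard(pmf, duration, max_duration=2000):
    """P(state ends after exactly `duration` steps | it has lasted at least that long).

    The survival sum is truncated at max_duration, which is assumed to be
    large enough that the remaining tail is negligible.
    """
    survival = sum(pmf(d) for d in range(duration, max_duration + 1))
    return pmf(duration) / survival
```

For a geometric dwell time the hazard stays at 1 - a for every duration, whereas a semi-Markov model may supply any dwell-time distribution to `leave_hazard`, and its hazard can then rise or fall with the time already spent in the state.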

14.
Single molecule FRET for the study on structural dynamics of biomolecules (total citations: 2; self-citations: 0; citations by others: 2)
Single molecule fluorescence resonance energy transfer (FRET) is a technique developed by combining FRET measurement and single molecule fluorescence imaging. This technique allows us to measure dynamic changes in the interactions and structures of biomolecules. In this study, the validity of the method was tested using fluorescent dyes on double stranded DNA molecules as a rigid spacer. FRET signals from double stranded DNA molecules were stable and their average FRET values provided the distance between the donor and acceptor in agreement with the B-DNA type helix model. Next, the single molecule FRET method was applied to studies of the dynamic structure of Ras, a signaling protein. The data showed that Ras has multiple conformational states and undergoes transitions between them. This study on the dynamic conformation of Ras provided a clue for understanding the molecular mechanism of cell signaling switches.

15.
Complex sequencing rules observed in birdsongs provide an opportunity to investigate the neural mechanism for generating complex sequential behaviors. To relate the findings from studying birdsongs to other sequential behaviors such as human speech and musical performance, it is crucial to characterize the statistical properties of the sequencing rules in birdsongs. However, the properties of the sequencing rules in birdsongs have not yet been fully addressed. In this study, we investigate the statistical properties of the complex birdsong of the Bengalese finch (Lonchura striata var. domestica). Based on manually annotated syllable labels, we first show that there are significant higher-order context dependencies in Bengalese finch songs, that is, which syllable appears next depends on more than one previous syllable. We then analyze acoustic features of the song and show that higher-order context dependencies can be explained using first-order hidden state transition dynamics with redundant hidden states. This model corresponds to hidden Markov models (HMMs), well known statistical models with a large range of application for time series modeling. The song annotation with these models with first-order hidden state dynamics agreed well with manual annotation, the score was comparable to that of a second-order HMM, and surpassed the zeroth-order model (the Gaussian mixture model; GMM), which does not use context information. Our results imply that the hierarchical representation with hidden state dynamics may underlie the neural implementation for generating complex behavioral sequences with higher-order dependencies.

16.
Surveillance data for communicable nosocomial pathogens usually consist of short time series of low-numbered counts of infected patients. These often show overdispersion and autocorrelation. To date, almost all analyses of such data have ignored the communicable nature of the organisms and have used methods appropriate only for independent outcomes. Inferences that depend on such analyses cannot be considered reliable when patient-to-patient transmission is important. We propose a new method for analysing these data based on a mechanistic model of the epidemic process. Since important nosocomial pathogens are often carried asymptomatically with overt infection developing in only a proportion of patients, the epidemic process is usually only partially observed by routine surveillance data. We therefore develop a 'structured' hidden Markov model where the underlying Markov chain is generated by a simple transmission model. We apply both structured and standard (unstructured) hidden Markov models to time series for three important pathogens. We find that both methods can offer marked improvements over currently used approaches when nosocomial spread is important. Compared to the standard hidden Markov model, the new approach is more parsimonious, is more biologically plausible, and allows key epidemiological parameters to be estimated.

17.
Summary: Continuous‐time multistate models are widely used for categorical response data, particularly in the modeling of chronic diseases. However, inference is difficult when the process is only observed at discrete time points, with no information about the times or types of events between observation times, unless a Markov assumption is made. This assumption can be limiting as rates of transition between disease states might instead depend on the time since entry into the current state. Such a formulation results in a semi‐Markov model. We show that the computational problems associated with fitting semi‐Markov models to panel‐observed data can be alleviated by considering a class of semi‐Markov models with phase‐type sojourn distributions. This allows methods for hidden Markov models to be applied. In addition, extensions to models where observed states are subject to classification error are given. The methodology is demonstrated on a dataset relating to development of bronchiolitis obliterans syndrome in post‐lung‐transplantation patients.

18.
MSMBuilder is a software package for building statistical models of high-dimensional time-series data. It is designed with a particular focus on the analysis of atomistic simulations of biomolecular dynamics such as protein folding and conformational change. MSMBuilder is named for its ability to construct Markov state models (MSMs), a class of models that has gained favor among computational biophysicists. In addition to both well-established and newer MSM methods, the package includes complementary algorithms for understanding time-series data such as hidden Markov models and time-structure based independent component analysis. MSMBuilder boasts an easy to use command-line interface, as well as clear and consistent abstractions through its Python application programming interface. MSMBuilder was developed with careful consideration for compatibility with the broader machine learning community by following the design of scikit-learn. The package is used primarily by practitioners of molecular dynamics, but is just as applicable to other computational or experimental time-series measurements.

19.
The hidden Markov model (HMM) is a framework for time series analysis widely applied to single-molecule experiments. Although initially developed for applications outside the natural sciences, the HMM has traditionally been used to interpret signals generated by physical systems, such as single molecules, evolving in a discrete state space observed at discrete time levels dictated by the data acquisition rate. Within the HMM framework, transitions between states are modeled as occurring at the end of each data acquisition period and are described using transition probabilities. Yet, whereas measurements are often performed at discrete time levels in the natural sciences, physical systems evolve in continuous time according to transition rates. It then follows that the modeling assumptions underlying the HMM are justified if the transition rates of a physical process from state to state are small as compared to the data acquisition rate. In other words, HMMs apply to slow kinetics. The problem is, because the transition rates are unknown in principle, it is unclear, a priori, whether the HMM applies to a particular system. For this reason, we must generalize HMMs for physical systems, such as single molecules, because these switch between discrete states in “continuous time”. We do so by exploiting recent mathematical tools developed in the context of inferring Markov jump processes and propose the hidden Markov jump process. We explicitly show in what limit the hidden Markov jump process reduces to the HMM. Resolving the discrete time discrepancy of the HMM has clear implications: we no longer need to assume that processes, such as molecular events, must occur on timescales slower than data acquisition and can learn transition rates even if these are on the same timescale or otherwise exceed data acquisition rates.
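
The slow-kinetics condition can be illustrated with a two-state system, for which the transition probabilities over one acquisition period have a closed form. The sketch below compares that exact form with a first-order approximation that allows at most one jump per period. It is not the hidden Markov jump process of the abstract, and it ignores the averaging of the signal within each period; the names `k12`, `k21`, and `dt` are hypothetical.

```python
import math


def exact_transition_matrix(k12, k21, dt):
    """Transition probabilities over a period dt for a two-state
    continuous-time Markov process with rates k12 (1 -> 2) and k21 (2 -> 1)."""
    total = k12 + k21
    decay = math.exp(-total * dt)
    p12 = k12 / total * (1.0 - decay)
    p21 = k21 / total * (1.0 - decay)
    return [[1.0 - p12, p12], [p21, 1.0 - p21]]


def first_order_transition_matrix(k12, k21, dt):
    """Approximation that assumes at most one jump per period: P = I + Q dt."""
    return [[1.0 - k12 * dt, k12 * dt], [k21 * dt, 1.0 - k21 * dt]]
```

When the rates are small relative to the acquisition rate (rate times `dt` much less than 1) the two matrices nearly coincide; when the rate times `dt` is of order 1 the approximation breaks down, which is the regime the abstract argues a continuous-time treatment is needed for.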

20.
Important methods for calculating likelihoods of genealogical relationships and mapping genes are based on hidden Markov models for the process of identity by descent along chromosomes. The computational time for the algorithms depends critically on the size of the statespace of the hidden Markov model. We describe the maximal grouping together of states of the model to reduce the size of the statespace. This grouping is based on pedigree symmetries. We also present an efficient algorithm for finding the maximal grouping.
