Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Markov chain models are frequently used for studying event histories that include transitions between several states. An empirical transition matrix for nonhomogeneous Markov chains has previously been developed, including a detailed statistical theory based on counting processes and martingales. In this article, we show how to estimate transition probabilities dependent on covariates. This technique may, e.g., be used for making estimates of individual prognosis in epidemiological or clinical studies. The covariates are included through nonparametric additive models on the transition intensities of the Markov chain. The additive model allows for estimation of covariate-dependent transition intensities, and again a detailed theory exists based on counting processes. The martingale setting now allows for a very natural combination of the empirical transition matrix and the additive model, resulting in estimates that can be expressed as stochastic integrals, and hence their properties are easily evaluated. Two medical examples will be given. In the first example, we study how the lung cancer mortality of uranium miners depends on smoking and radon exposure. In the second example, we study how the probability of being in response depends on patient group and prophylactic treatment for leukemia patients who have had a bone marrow transplantation. A program in R and S-PLUS that can carry out the analyses described here has been developed and is freely available on the Internet.
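
As a rough illustration of the empirical transition matrix mentioned in this abstract, the estimate can be sketched as a product of one-step matrices (identity plus the increment of the estimated cumulative intensities) over the observed event times. This sketch leaves out the covariate-dependent additive model that the article contributes. The function name `empirical_transition_matrix` and the input layout are assumptions made for illustration; they are not taken from the authors' R/S-PLUS program.

```python
def mat_mul(x, y):
    """Multiply two square matrices given as lists of lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def empirical_transition_matrix(n_states, event_times):
    """Product-limit estimate of the transition matrix of a Markov chain.

    `event_times` (hypothetical layout) is a time-ordered list with one
    dict per event time, mapping (from_state, to_state) to a pair
    (number of observed transitions, number at risk in from_state).
    """
    identity = [[1.0 if i == j else 0.0 for j in range(n_states)]
                for i in range(n_states)]
    estimate = [row[:] for row in identity]
    for transitions in event_times:
        # One-step matrix: identity plus the increment of the
        # estimated cumulative transition intensities.
        step = [row[:] for row in identity]
        for (h, j), (n_events, n_at_risk) in transitions.items():
            increment = n_events / n_at_risk
            step[h][j] += increment
            step[h][h] -= increment
        estimate = mat_mul(estimate, step)
    return estimate


# Toy two-state example (0 = alive, 1 = dead): one death among four
# subjects at risk, then one death among the remaining three.
toy = empirical_transition_matrix(2, [{(0, 1): (1, 4)}, {(0, 1): (1, 3)}])
```

With only two states and one absorbing state, the top-left entry of `toy` reduces to the Kaplan-Meier survival estimate, which is a convenient sanity check for the sketch.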

2.
This paper analyzes the nucleotide sequences of three viruses: Kunjin, West Nile, and yellow fever. Each virus has one long open reading frame of greater than 10,200 nucleotides that codes for four structural and seven nonstructural genes. The Kunjin and West Nile viruses are the most closely related pair, when assessed on the basis of matches between their nucleotide sequences. As would be expected, the matching is least for bases at third-position codon sites and is greatest for second-position sites. Statistics are presented for the numbers of mismatches that are transitions or transversions. Nucleotide base usage is also reported. To each of the 33 virus-gene segments, nonhomogeneous Markov chain models have been fitted to describe the sequences of nucleotide bases. The models allow for different transition probabilities ("transition" is used in the mathematical sense here) and for different degrees of dependency, at the three sites in the codons. Reasonably satisfactory fits can be obtained for many of the genes by using models that are first order for both first- and second-position sites in the codon but that are second order for third-position sites. One consequence of such a model is that the correlation between one amino acid and the next is limited to the correlation of the last base of the former with the first base of the latter. Other consequences are that the model can (and does) prohibit the occurrence of stop codons within a gene and that subsequences of only first-position bases, or only third-position bases, are also first-order Markov chains. In theory, second-position subsequences may not be Markov chains at all. In practice, the data suggest that each of these subsequences is effectively a zero-order Markov chain, i.e., bases spaced three apart are statistically independent. Stationarity of nucleotide base distributions can be interpreted in either of two ways: (1) spatially along the sites or (2) temporally at each site. These interpretations must often be inconsistent, when the former allows for Markov dependence between adjacent sites whereas the latter assumes independence between sites. The inconsistency can be overcome, for these viruses, if subsequences at different codon positions are analyzed separately.

3.
C Fuchs 《Gene》1980,10(4):371-373
Several Markov chain models (up to fourth order) have been fitted to the sequences of the seven DNAs presented in Fuchs et al. (1980). Two methods for determining the order of Markov chain are applied to the data. The two methods lead to different conclusions and we discuss these discrepancies. When the distribution of the nucleotides in a DNA sequence is investigated, it is suggested that the study on the order of the Markov model should be supplemented with additional analysis.
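
The abstract does not say which two order-selection methods were compared. As one common illustration only, the sketch below fits fixed-order Markov chains to a nucleotide string by maximum likelihood and picks the order with the smallest Bayesian information criterion (BIC). The choice of BIC, the function names, and the toy sequence are all assumptions made for this example.

```python
import math
from collections import Counter


def log_likelihood(seq, order, start):
    """Maximised log-likelihood of an order-`order` Markov chain.

    Only positions from `start` onward are scored, so that models of
    different order are compared on the same set of observations.
    """
    context_counts = Counter()
    transition_counts = Counter()
    for i in range(start, len(seq)):
        context = seq[i - order:i]  # empty string when order == 0
        context_counts[context] += 1
        transition_counts[(context, seq[i])] += 1
    return sum(n * math.log(n / context_counts[context])
               for (context, _), n in transition_counts.items())


def bic(seq, order, max_order, alphabet_size=4):
    """BIC = -2 log L + (number of free parameters) * log(sample size)."""
    n_obs = len(seq) - max_order
    n_params = (alphabet_size ** order) * (alphabet_size - 1)
    return -2.0 * log_likelihood(seq, order, max_order) + n_params * math.log(n_obs)


def select_order(seq, max_order):
    """Return the order in 0..max_order with the smallest BIC."""
    return min(range(max_order + 1), key=lambda k: bic(seq, k, max_order))


# Toy sequence in which each base is fully determined by the previous one.
toy_sequence = "ACGT" * 50
best_order = select_order(toy_sequence, max_order=3)
```

On the toy sequence a first-order chain already predicts every base exactly, so higher orders only add parameters and are penalised. Real DNA is far less clear-cut, which is consistent with the abstract's advice to supplement order selection with additional analysis.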

4.
A vacancy chain is a unique type of resource acquisition process composed of an interconnected series of events in which the gaining of a particular resource unit by one individual depends directly on prior acquisition events by other individuals. Taken from the sociological literature, vacancy chains may also describe the distribution of many types of animal resources such as burrows, dwellings and shelters. Using data on hermit crabs, we present a Markov model simulating a vacancy chain process, and test the model against field data. Our results show that a simple Markov model adequately describes shell acquisition in hermit crabs, and that models combining shell size and crude estimates of quality fit the data extremely well. We illustrate in detail how to generate vacancy chain models from ecological data, how to determine the number and size of organisms gaining new resource units from resource introductions of specific sizes, and how to statistically evaluate the accuracy of Markov models. Not recognizing the presence of a vacancy chain system may lead to serious errors in estimating resource dynamics and therefore in demographic and competition models based on these dynamics. Finally, we suggest some ways in which vacancy chain models can aid studies of competition, population dynamics, life histories, and conservation in species using this type of resource acquisition process.

5.
MOTIVATION: Markov chain models of DNA sequences have frequently been used in gene finding algorithms. Performance of the algorithm critically depends on the model structure and parameters. Still, the issue of choosing the model structure has not been studied with sufficient attention. RESULTS: We have assessed performance of several types of Markov chain models, both fixed order (FO) models and models with interpolation, within the framework of the GeneMark algorithm. The performance was measured in two ways: (i) the accuracy of detection of protein-coding potential in artificial DNA sequences and (ii) the accuracy of identifying genes in real prokaryotic genomes. We observed that the models built by deleted interpolation (DI) slightly outperformed other models in detecting protein-coding potential in artificial DNA sequences with GC content in the medium range and also in detecting genes in real genomes with medium GC content. For artificial and real genomic DNA with high or low GC content, we observed that the models built by DI were in some cases slightly outperformed by FO models.

7.
8.
We illustrate through examples how monotonicity can help in the performance evaluation of networks. We consider two different applications of stochastic monotonicity in performance evaluation. In the first one, we assume that a Markov chain of the model depends on a parameter that can be estimated only up to a certain level, so that we have only an interval that contains the exact value of the parameter. Instead of taking an approximated value for the unknown parameter, we show how we can use the monotonicity properties of the Markov chain to take into account the error bound from the measurements. In the second application, we consider a well-known approximation method: the decomposition into Markovian submodels. In such an approach, models of complex networks or other systems are decomposed into Markovian submodels whose results are then used as parameters for the next submodel in an iterative computation. One obtains a fixed point system which is solved numerically. In general, we have neither an existence proof of the solution of the fixed point system nor a convergence proof of the iterative algorithm. Here we show how stochastic monotonicity can be used to answer these questions and provide, to some extent, the theoretical foundations for this approach. Furthermore, monotonicity properties can also help to derive more efficient algorithms to solve fixed point systems.

9.
Liu L  Ho YK  Yau S 《DNA and cell biology》2007,26(7):477-483
The inhomogeneous Markov chain model is used to discriminate acceptor and donor sites in genomic DNA sequences. It outperforms statistical methods such as the homogeneous Markov chain model, higher-order Markov chain models, and interpolated Markov chain models, as well as machine-learning methods such as k-nearest neighbor and support vector machine. Besides its high accuracy, another advantage of the inhomogeneous Markov chain model is its computational simplicity. In the three-state system (acceptor, donor, and neither), the inhomogeneous Markov chain model is combined with a three-layer feed-forward neural network. Using this combined system, 3175 primate splice-junction gene sequences have been tested, with a prediction accuracy of greater than 98%.

10.
Models of calcium (Ca²⁺) release sites derived from continuous-time Markov chain (CTMC) models of intracellular Ca²⁺ channels exhibit collective gating reminiscent of the experimentally observed phenomenon of Ca²⁺ puffs and sparks. In order to overcome the state-space explosion that occurs in compositionally defined Ca²⁺ release site models, we have implemented an automated procedure for model reduction that replaces aggregated states of the full release site model with much simpler CTMCs that have similar within-group phase-type sojourn times and inter-group transitions. Error analysis based on comparison of full and reduced models validates the method when applied to release site models composed of 20 three-state channels that are both activated and inactivated by Ca²⁺. Although inspired by existing techniques for fitting moments of phase-type distributions, the automated reduction method for compositional Ca²⁺ release site models is unique in several respects and novel in this biophysical context.

11.
12.
The rates of functional recovery after stroke tend to decrease with time. Time-varying Markov processes (TVMP) may be more biologically plausible than time-invariant Markov processes for modeling such data. However, analysis of such stochastic processes, particularly tackling reversible transitions and the incorporation of random effects into models, can be analytically intractable. We make use of ordinary differential equations to solve continuous-time TVMP with reversible transitions. The proportional hazard form was used to assess the effects of an individual's covariates on multi-state transitions, with the incorporation of random effects that capture the residual variation left unexplained by measured covariates, within the framework of the generalized linear model. We further built a Bayesian directed acyclic graphical model to obtain the full joint posterior distribution. Markov chain Monte Carlo (MCMC) with Gibbs sampling was applied to estimate parameters based on posterior marginal distributions with multiple integrands. The proposed method was illustrated with empirical data from a study on the functional recovery after stroke.

13.
In this paper the motion of a single cell is modeled as a nucleus and multiple integrin-based adhesion sites. Numerical simulations and analysis of the model indicate that when the stochastic nature of the adhesion sites is a memoryless and force-independent random process, the cell speed is independent of the force these adhesion sites exert on the cell. Furthermore, understanding the dynamics of the attachment and detachment of the adhesion sites is key to predicting cell speed. We introduce a differential equation describing the cell motion and then introduce a conjecture about the expected drift of the cell, the expected average velocity relation conjecture. Using Markov chain theory, we analyze our conjecture in the context of a related (but simpler) model of cell motion, and then numerically compare the results for the simpler model and the full differential equation model. We also heuristically describe the relationship between the simplified and full models and discuss the biological significance of these results.

14.
Pradel R 《Biometrics》2005,61(2):442-447
Capture-recapture models were originally developed to account for encounter probabilities that are less than 1 in free-ranging animal populations. Nowadays, these models can deal with the movement of animals between different locations and are also used to study transitions between different states. However, their use to estimate transitions between states does not account for uncertainty in state assignment. I present the extension of multievent models, which does incorporate this uncertainty. Multievent models belong to the family of hidden Markov models. I also show in this article that the memory model, in which the next state or location is influenced by the previous state occupied, can be fully treated within the framework of multievent models.

15.
Raberto M  Rapallo F  Scalas E 《PloS one》2011,6(8):e23370
In this paper, we outline a model of graph (or network) dynamics based on two ingredients. The first ingredient is a Markov chain on the space of possible graphs. The second ingredient is a semi-Markov counting process of renewal type. The model consists in subordinating the Markov chain to the semi-Markov counting process. In simple words, this means that the chain transitions occur at random time instants called epochs. The model is quite rich and its possible connections with algebraic geometry are briefly discussed. Moreover, for the sake of simplicity, we focus on the space of undirected graphs with a fixed number of nodes. However, in an example, we present an interbank market model where it is meaningful to use directed graphs or even weighted graphs.

16.
In this paper we analyze the isolation-with-migration model in a continuous-time Markov chain framework, and derive analytical expressions for the probability densities of gene tree topologies with an arbitrary number of lineages. We combine these densities with both nucleotide-substitution and infinite sites mutation models and derive probabilities for use in maximum likelihood estimation. We demonstrate how to apply lumpability of continuous-time Markov chains to achieve a significant reduction in the size of the state-space under consideration. We use matrix exponentiation and spectral decomposition to derive explicit expressions for the case of two diploid individuals in two populations, when the data is given as alignment columns. We implement these expressions in order to carry out a maximum likelihood analysis and provide a simulation study to examine the performance of our method in terms of our ability to recover true parameters. Finally, we show how the performance depends on the parameters in the model.
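
To make the role of matrix exponentiation and spectral decomposition concrete, the sketch below treats a deliberately tiny continuous-time Markov chain: a single lineage that moves between two populations at hypothetical rates `a` and `b`. This is far smaller than the state space of the isolation-with-migration model in the abstract and is only an assumed toy example. The transition matrix is computed once from the closed-form spectral decomposition of the generator and once from a truncated Taylor series of the matrix exponential, so the two can be compared.

```python
import math


def transition_matrix_spectral(a, b, t):
    """exp(Q t) for Q = [[-a, a], [b, -b]] from its spectral decomposition.

    Q has eigenvalues 0 and -(a + b). The projector for eigenvalue 0 has
    both rows equal to the stationary distribution (b, a) / (a + b).
    """
    total = a + b
    stationary = [b / total, a / total]
    decay = math.exp(-total * t)
    return [[stationary[j] + decay * ((1.0 if i == j else 0.0) - stationary[j])
             for j in range(2)] for i in range(2)]


def mat_mul(x, y):
    """Multiply two square matrices given as lists of lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition_matrix_series(q, t, terms=60):
    """exp(Q t) from the truncated Taylor series sum_k (Q t)^k / k!."""
    n = len(q)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        scaled = [[q[i][j] * t / k for j in range(n)] for i in range(n)]
        term = mat_mul(term, scaled)
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


# Hypothetical migration rates and elapsed time.
a, b, t = 0.3, 0.7, 2.0
p_spectral = transition_matrix_spectral(a, b, t)
p_series = transition_matrix_series([[-a, a], [b, -b]], t)
```

For larger generators the eigenvalues and eigenvectors would normally come from a numerical linear algebra library rather than a closed form. The two-state case is used here only because it can be checked by hand.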

17.
Improved efficiency of Markov chain Monte Carlo facilitates all aspects of statistical analysis with Bayesian hierarchical models. Identifying strategies to improve MCMC performance is becoming increasingly crucial as the complexity of models, and the run times to fit them, increases. We evaluate different strategies for improving MCMC efficiency using the open-source software NIMBLE (R package nimble) using common ecological models of species occurrence and abundance as examples. We ask how MCMC efficiency depends on model formulation, model size, data, and sampling strategy. For multiseason and/or multispecies occupancy models and for N-mixture models, we compare the efficiency of sampling discrete latent states vs. integrating over them, including more vs. fewer hierarchical model components, and univariate vs. block-sampling methods. We include the common MCMC tool JAGS in comparisons. For simple models, there is little practical difference between computational approaches. As model complexity increases, there are strong interactions between model formulation and sampling strategy on MCMC efficiency. There is no one-size-fits-all best strategy, but rather problem-specific best strategies related to model structure and type. In all but the simplest cases, NIMBLE's default or customized performance achieves much higher efficiency than JAGS. In the two most complex examples, NIMBLE was 10–12 times more efficient than JAGS. We find NIMBLE is a valuable tool for many ecologists utilizing Bayesian inference, particularly for complex models where JAGS is prohibitively slow. Our results highlight the need for more guidelines and customizable approaches to fit hierarchical models to ensure practitioners can make the most of occupancy and other hierarchical models. By implementing model-generic MCMC procedures in open-source software, including the NIMBLE extensions for integrating over latent states (implemented in the R package nimbleEcology), we have made progress toward this aim.

18.
A survey is given of continuous-time Markov chain models for ionizing radiation damage to the genome of mammalian cells. In such models, immediate damage induced by the radiation is regarded as a batch-Poisson arrival process of DNA double-strand breaks (DSBs). Enzymatic modification of the immediate damage is modeled as a Markov process similar to those described by the master equation of stochastic chemical kinetics. An illustrative example is the restitution/complete-exchange model. The model postulates that, after being induced by radiation, DSBs subsequently either undergo enzymatically mediated restitution (repair) or participate pairwise in chromosome exchanges. Some of the exchanges make irremediable lesions such as dicentric chromosome aberrations. One may have rapid irradiation followed by enzymatic DSB processing or have prolonged irradiation with both DSB arrival and enzymatic DSB processing continuing throughout the irradiation period. Methods for analyzing the Markov chains include using an approximate model for expected values, the discrete-time Markov chain embedded at transitions, partial differential equations for generating functions, normal perturbation theory, singular perturbation theory with scaling, numerical computations, and certain matrix methods that combine Perron-Frobenius theory with variational estimates. Applications to experimental results on expected values, variances, and statistical distributions of DNA lesions are briefly outlined. Continuous-time Markov chains are the most systematic of those radiation damage models that treat DSB-DSB interactions within the cell nucleus as homogeneous (e.g., ignore diffusion limitations). They contain virtually all other relevant homogeneous models and semiempirical summaries as special cases, limiting cases, or approximations. However, the Markov models do not seem to be well suited for studying spatial dependence of DSB interactions, which is known to be important in some situations.  

19.
This paper proposes the use of hidden Markov time series models for the analysis of the behaviour sequences of one or more animals under observation. These models have advantages over the Markov chain models commonly used for behaviour sequences, as they can allow for time-trend or expansion to several subjects without sacrificing parsimony. Furthermore, they provide an alternative to higher-order Markov chain models if a first-order Markov chain is unsatisfactory as a model. To illustrate the use of such models, we fit multivariate and univariate hidden Markov models allowing for time-trend to data from an experiment investigating the effects of feeding on the locomotory behaviour of locusts (Locusta migratoria).
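
Fitting a hidden Markov model to a behaviour sequence rests on evaluating the likelihood of the observations, which is usually done with the forward algorithm. The sketch below is a minimal, time-homogeneous version with made-up parameters. It does not include the time-trend or the multivariate structure described in the abstract, and the two hidden states and binary observations (0 = still, 1 = moving) are assumptions for illustration. A brute-force sum over all hidden paths is included purely as a check.

```python
import math
from itertools import product


def forward_log_likelihood(obs, init, trans, emit):
    """Log-likelihood of `obs` under an HMM, using scaled forward recursions."""
    n = len(init)
    alpha = [init[s] * emit[s][obs[0]] for s in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the hidden chain, then weight
            # by the probability of the current observation.
            alpha = [sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][obs[t]]
                     for s in range(n)]
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [value / scale for value in alpha]  # rescale to avoid underflow
    return log_lik


def brute_force_likelihood(obs, init, trans, emit):
    """Likelihood obtained by summing over every hidden state path."""
    total = 0.0
    for path in product(range(len(init)), repeat=len(obs)):
        prob = init[path[0]] * emit[path[0]][obs[0]]
        for t in range(1, len(obs)):
            prob *= trans[path[t - 1]][path[t]] * emit[path[t]][obs[t]]
        total += prob
    return total


# Hypothetical parameters: hidden state 0 = resting, 1 = active.
init = [0.6, 0.4]
trans = [[0.9, 0.1], [0.2, 0.8]]
emit = [[0.8, 0.2], [0.3, 0.7]]  # emit[state][observation]
observations = [0, 0, 1, 1, 0, 1]

log_lik = forward_log_likelihood(observations, init, trans, emit)
check = brute_force_likelihood(observations, init, trans, emit)
```

The forward recursion costs time linear in the sequence length, whereas the brute-force sum grows exponentially, which is why the latter is only usable for short test sequences like this one.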

20.
An improved Markov chain model has been developed for forecasting sugarcane yields, using growth indices of biometrical characters based on data from two stages simultaneously. Comparisons were also made with the models in use, viz. the regression model and the first-order Markov chain model.
