期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Bayesian restoration of a hidden Markov chain with applications to DNA sequencing.

G A Churchill B Lazareva 《Journal of computational biology》1999,6(2):261-277

Hidden Markov models (HMMs) are a class of stochastic models that have proven to be powerful tools for the analysis of molecular sequence data. A hidden Markov model can be viewed as a black box that generates sequences of observations. The unobservable internal state of the box is stochastic and is determined by a finite state Markov chain. The observable output is stochastic with distribution determined by the state of the hidden Markov chain. We present a Bayesian solution to the problem of restoring the sequence of states visited by the hidden Markov chain from a given sequence of observed outputs. Our approach is based on a Monte Carlo Markov chain algorithm that allows us to draw samples from the full posterior distribution of the hidden Markov chain paths. The problem of estimating the probability of individual paths and the associated Monte Carlo error of these estimates is addressed. The method is illustrated by considering a problem of DNA sequence multiple alignment. The special structure for the hidden Markov model used in the sequence alignment problem is considered in detail. In conclusion, we discuss certain interesting aspects of biological sequence alignments that become accessible through the Bayesian approach to HMM restoration. 相似文献

2.

Markov chains: computing limit existence and approximations with DNA

Cardona M Colomer MA Conde J Miret JM Miró J Zaragoza A 《Bio Systems》2005,81(3):261-266

We present two algorithms to perform computations over Markov chains. The first one determines whether the sequence of powers of the transition matrix of a Markov chain converges or not to a limit matrix. If it does converge, the second algorithm enables us to estimate this limit. The combination of these algorithms allows the computation of a limit using DNA computing. In this sense, we have encoded the states and the transition probabilities using strands of DNA for generating paths of the Markov chain. 相似文献

3.

Derivatives of the stochastic growth rate

Steinsaltz D Tuljapurkar S Horvitz C 《Theoretical population biology》2011,80(1):1-15

We consider stochastic matrix models for population driven by random environments which form a Markov chain. The top Lyapunov exponent a, which describes the long-term growth rate, depends smoothly on the demographic parameters (represented as matrix entries) and on the parameters that define the stochastic matrix of the driving Markov chain. The derivatives of a-the “stochastic elasticities”-with respect to changes in the demographic parameters were derived by Tuljapurkar (1990). These results are here extended to a formula for the derivatives with respect to changes in the Markov chain driving the environments. We supplement these formulas with rigorous bounds on computational estimation errors, and with rigorous derivations of both the new and old formulas. 相似文献

4.

Coalescent process with fluctuating population size and its effective size 总被引：3，自引：0，他引：3

Sano A Shimizu A Iizuka M 《Theoretical population biology》2004,65(1):39-48

We consider a Wright-Fisher model whose population size is a finite Markov chain. We introduce a sequence of two-dimensional discrete time Markov chains whose components describe the coalescent process and the fluctuation of population size. For the limiting process of the sequence of Markov chains, the relationship of the expectation of coalescence time to the harmonic and the arithmetic means of population sizes is shown, and the Laplace transform of the distribution of coalescence time is calculated. We define the coalescence effective population size (cEPS) by the expectation of coalescence time. We show that cEPS is strictly larger (resp. smaller) than the harmonic (resp. arithmetic) mean. As the population size fluctuates more quickly (resp. slowly), cEPS is closer to the harmonic (resp. arithmetic) mean. For the case of a two-valued Markov chain, we show the explicit expression of cEPS and its dependency on the sample size. 相似文献

5.

Prediction of primate splice site using inhomogeneous Markov chain and neural network

Liu L Ho YK Yau S 《DNA and cell biology》2007,26(7):477-483

The inhomogeneous Markov chain model is used to discriminate acceptor and donor sites in genomic DNA sequences. It outperforms statistical methods such as homogeneous Markov chain model, higher order Markov chain and interpolated Markov chain models, and machine-learning methods such as k-nearest neighbor and support vector machine as well. Besides its high accuracy, another advantage of inhomogeneous Markov chain model is its simplicity in computation. In the three states system (acceptor, donor, and neither), the inhomogeneous Markov chain model is combined with a three-layer feed forward neural network. Using this combined system 3175 primate splice-junction gene sequences have been tested, with a prediction accuracy of greater than 98%. 相似文献

6.

The Metropolized Partial Importance Sampling MCMC mixes slowly on minimum reversal rearrangement paths

Miklós I Mélykúti B Swenson K 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2010,7(4):763-767

Markov chain Monte Carlo has been the standard technique for inferring the posterior distribution of genome rearrangement scenarios under a Bayesian approach. We present here a negative result on the rate of convergence of the generally used Markov chains. We prove that the relaxation time of the Markov chains walking on the optimal reversal sorting scenarios might grow exponentially with the size of the signed permutations, namely, with the number of syntheny blocks. 相似文献

7.

Bayesian analysis of correlated evolution of discrete characters by reversible-jump Markov chain Monte Carlo

Pagel M Meade A 《The American naturalist》2006,167(6):808-825

We describe a Bayesian method for investigating correlated evolution of discrete binary traits on phylogenetic trees. The method fits a continuous-time Markov model to a pair of traits, seeking the best fitting models that describe their joint evolution on a phylogeny. We employ the methodology of reversible-jump (RJ) Markov chain Monte Carlo to search among the large number of possible models, some of which conform to independent evolution of the two traits, others to correlated evolution. The RJ Markov chain visits these models in proportion to their posterior probabilities, thereby directly estimating the support for the hypothesis of correlated evolution. In addition, the RJ Markov chain simultaneously estimates the posterior distributions of the rate parameters of the model of trait evolution. These posterior distributions can be used to test among alternative evolutionary scenarios to explain the observed data. All results are integrated over a sample of phylogenetic trees to account for phylogenetic uncertainty. We implement the method in a program called RJ Discrete and illustrate it by analyzing the question of whether mating system and advertisement of estrus by females have coevolved in the Old World monkeys and great apes. 相似文献

8.

The influence of internal fluctuations and external noise on the gene frequency in a haploid population

Mark F. Schumaker W. Horsthemke 《Bulletin of mathematical biology》1985,47(4):513-533

We consider a Markov chain modeling competition between two alleles in a haploid population of constant size and undergoing mutation, selection and Fisher-Wright mating. The Markov chain is rescaled to a diffusion process. We study the interaction of external noise (due to a random selection coefficient) and internal fluctuations (due to mating); the interaction is found to be additive. The stationary probability density displays a critical point. We draw an analogy between the behavior of the probability density at the critical point and the theory of phase transitions; critical exponents are introduced and calculated. We also analyze the effect of external noise on the genetic diversity of the population and on mean first passage times of the gene frequency. 相似文献

9.

Markovian approximation to the finite loci coalescent with recombination along multiple sequences

《Theoretical population biology》2014

The coalescent with recombination process has initially been formulated backwards in time, but simulation algorithms and inference procedures often apply along sequences. Therefore it is of major interest to approximate the coalescent with recombination process by a Markov chain along sequences. We consider the finite loci case and two or more sequences. We formulate a natural Markovian approximation for the tree building process along the sequences, and derive simple and analytically tractable formulae for the distribution of the tree at the next locus conditioned on the tree at the present locus. We compare our Markov approximation to other sequential Markov chains and discuss various applications. 相似文献

10.

Quasi-stationary distributions of a stochastic metapopulation model 总被引：2，自引：0，他引：2

Mats Gyllenberg Dmitrii S. Silvestrov 《Journal of mathematical biology》1994,33(1):35-70

相似文献

11.

Detecting Memory and Structure in Human Navigation Patterns Using Markov Chain Models of Varying Order

Philipp Singer Denis Helic Behnam Taraghi Markus Strohmaier 《PloS one》2014,9(7)

One of the most frequently used models for understanding human navigation on the Web is the Markov chain model, where Web pages are represented as states and hyperlinks as probabilities of navigating from one page to another. Predominantly, human navigation on the Web has been thought to satisfy the memoryless Markov property stating that the next page a user visits only depends on her current page and not on previously visited ones. This idea has found its way in numerous applications such as Google''s PageRank algorithm and others. Recently, new studies suggested that human navigation may better be modeled using higher order Markov chain models, i.e., the next page depends on a longer history of past clicks. Yet, this finding is preliminary and does not account for the higher complexity of higher order Markov chain models which is why the memoryless model is still widely used. In this work we thoroughly present a diverse array of advanced inference methods for determining the appropriate Markov chain order. We highlight strengths and weaknesses of each method and apply them for investigating memory and structure of human navigation on the Web. Our experiments reveal that the complexity of higher order models grows faster than their utility, and thus we confirm that the memoryless model represents a quite practical model for human navigation on a page level. However, when we expand our analysis to a topical level, where we abstract away from specific page transitions to transitions between topics, we find that the memoryless assumption is violated and specific regularities can be observed. We report results from experiments with two types of navigational datasets (goal-oriented vs. free form) and observe interesting structural differences that make a strong argument for more contextual studies of human navigation in future work. 相似文献

12.

Correlation of pause sites in MDV-1 RNA replication with kinetic refolding of the growing chain. A Monte Carlo simulation of the Markov process

A Fernández 《European journal of biochemistry》1989,182(1):161-163

We consider a template-instructed MDV-1 RNA replication catalyzed by Q beta replicase. The relaxation of the effective binding of the growing chain to the enzyme is simulated, considering a Markov process where refolding events in the chain are concomitant with chain elongation events. When realistic transition probabilities are considered, it becomes evident that the pause sites in replication are due to relaxation of the binding by folding of the growing chain. 相似文献

13.

Wiener analysis of functionals of a Markov chain: Application to neural transformations of random signals

John P. Kroeker 《Biological cybernetics》1980,36(4):243-248

The Wiener method of nonlinear system identification is extended to systems with a Markov chain input. Multivariate functionals are constructed that are orthonormal with respect to the probability measure of the Markov input. Any system operating on a Markov chain may be represented by an orthogonal expansion in these functionals. The coefficients of the orthogonal expansion may be evaluated by crosscorrelation. Application of this technique to nonlinear neural systems with a Markov actionpotential input are discussed. 相似文献

14.

Past states of continuous-time Markov models for ecological communities

Spencer M 《Mathematical biosciences》2008,211(2):299-313

Discrete-time Markov chains are often used to model communities of sessile organisms. The community is described by a set of discrete states, which may represent species or groups of species. Transitions between states are modelled using a stochastic matrix. A recent study showed how the time-reversal of such a Markov chain can be used to estimate the distribution of time since the last occurrence of some state of interest (such as empty space) at a point, given the current state of the point. However, if the underlying process operates in continuous time but is observed at regular intervals, this distribution describes the time since the last possible observation of the state of interest, rather than the time since its last occurrence. We show how to obtain the distribution of time since the last occurrence of a state of interest for a continuous-time homogeneous Markov chain. The expected time since the last occurrence of an initial state can be interpreted as a measure of the successional rank of a state. We show how to distinguish between different ways in which a state can have high successional rank. We apply our results to a marine subtidal community. 相似文献

15.

Hidden Markov Models and Animal Behaviour

Iain L. Macdonald David Raubenheimer 《Biometrical journal. Biometrische Zeitschrift》1995,37(6):701-712

This paper proposes the use of hidden Markov time series models for the analysis of the behaviour sequences of one or more animals under observation. These models have advantages over the Markov chain models commonly used for behaviour sequences, as they can allow for time-trend or expansion to several subjects without sacrificing parsimony. Furthermore, they provide an alternative to higher-order Markov chain models if a first-order Markov chain is unsatisfactory as a model. To illustrate the use of such models, we fit multivariate and univariate hidden Markov models allowing for time-trend to data from an experiment investigating the effects of feeding on the locomotory behaviour of locusts (Locusta migratoria). 相似文献

16.

The joint distribution of patterns in random sequences with application to the RC-measure for expressivity

Kleffe J.; Grau E. 《Bioinformatics (Oxford, England)》1993,9(3):275-283

A method previously developed for computation of pattern probabilitiesin random sequences under Markov chain models. We extend thismethod to the calculation of the joint distribution for twopatterns. An application yields the distribution of the rightchoice measure for expressivity and how significance boundsdepend on sequence length. These bounds are used to show thatthe choice of pyrimidine in codon position 3 of Escherichiacoli genes deviates considerably from a general Markov processmodel for coding regions. We also derive some statistical evidencethat this significant deviation is limited to codon position3. 相似文献

17.

Steady-state analysis of genetic regulatory networks modelled by probabilistic boolean networks 总被引：1，自引：0，他引：1

Shmulevich I Gluhovsky I Hashimoto RF Dougherty ER Zhang W 《Comparative and Functional Genomics》2003,4(6):601-608

Probabilistic Boolean networks (PBNs) have recently been introduced as a promising class of models of genetic regulatory networks. The dynamic behaviour of PBNs can be analysed in the context of Markov chains. A key goal is the determination of the steady-state (long-run) behaviour of a PBN by analysing the corresponding Markov chain. This allows one to compute the long-term influence of a gene on another gene or determine the long-term joint probabilistic behaviour of a few selected genes. Because matrix-based methods quickly become prohibitive for large sizes of networks, we propose the use of Monte Carlo methods. However, the rate of convergence to the stationary distribution becomes a central issue. We discuss several approaches for determining the number of iterations necessary to achieve convergence of the Markov chain corresponding to a PBN. Using a recently introduced method based on the theory of two-state Markov chains, we illustrate the approach on a sub-network designed from human glioma gene expression data and determine the joint steadystate probabilities for several groups of genes. 相似文献

18.

Analysing grouping of nucleotides in DNA sequences using lumped processes constructed from Markov chains

Guédon Y d'Aubenton-Carafa Y Thermes C 《Journal of mathematical biology》2006,52(3):343-372

The most commonly used models for analysing local dependencies in DNA sequences are (high-order) Markov chains. Incorporating knowledge relative to the possible grouping of the nucleotides enables to define dedicated sub-classes of Markov chains. The problem of formulating lumpability hypotheses for a Markov chain is therefore addressed. In the classical approach to lumpability, this problem can be formulated as the determination of an appropriate state space (smaller than the original state space) such that the lumped chain defined on this state space retains the Markov property. We propose a different perspective on lumpability where the state space is fixed and the partitioning of this state space is represented by a one-to-many probabilistic function within a two-level stochastic process. Three nested classes of lumped processes can be defined in this way as sub-classes of first-order Markov chains. These lumped processes enable parsimonious reparameterizations of Markov chains that help to reveal relevant partitions of the state space. Characterizations of the lumped processes on the original transition probability matrix are derived. Different model selection methods relying either on hypothesis testing or on penalized log-likelihood criteria are presented as well as extensions to lumped processes constructed from high-order Markov chains. The relevance of the proposed approach to lumpability is illustrated by the analysis of DNA sequences. In particular, the use of lumped processes enables to highlight differences between intronic sequences and gene untranslated region sequences. 相似文献

19.

Chains of opportunity: A Markov model for acquisition of reusable resources

Marc Weissburg Leonard Roseman Ivan D. Chase 《Evolutionary ecology》1991,5(2):105-117

Summary A vacancy chain is a unique type of resource acquisition process composed of an interconnected series of events in which the gaining of a particular resource unit by one individual depends directly on prior acquisition events by other individuals. Taken from the sociological literature, vacancy chains may also describe the distribution of many types of animal resources such as burrows, dwellings and shelters. Using data on hermit crabs, we present a Markov model simulating a vacancy chain process, and test the model against field data. Our results show that a simple Markov model adequately describes shell acquisition in hermit crabs, and that models combining shell size and crude estimates of quality fit the data extremely well. We illustrate in detail how to generate vacancy chain models from ecological data, how to determine the number and size of organisms gaining new resource units from resource introductions of specific sizes, and how to statistically evaluate the accuracy of Markov models. Not recognizing the presence of a vacancy chain system may lead to serious errors in estimating resource dynamics and therefore in demographic and competition models based on these dynamics. Finally, we suggest some ways in which vacancy chain models can aid studies of competition, population dynamics, life histories, and conservation in species using this type of resource acquisition process. 相似文献

20.

Modeling haplotype block variation using Markov chains

下载免费PDF全文

Greenspan G Geiger D 《Genetics》2006,172(4):2583-2599

Models of background variation in genomic regions form the basis of linkage disequilibrium mapping methods. In this work we analyze a background model that groups SNPs into haplotype blocks and represents the dependencies between blocks by a Markov chain. We develop an error measure to compare the performance of this model against the common model that assumes that blocks are independent. By examining data from the International Haplotype Mapping project, we show how the Markov model over haplotype blocks is most accurate when representing blocks in strong linkage disequilibrium. This contrasts with the independent model, which is rendered less accurate by linkage disequilibrium. We provide a theoretical explanation for this surprising property of the Markov model and relate its behavior to allele diversity. 相似文献