1.

Analysing grouping of nucleotides in DNA sequences using lumped processes constructed from Markov chains

Guédon Y d'Aubenton-Carafa Y Thermes C 《Journal of mathematical biology》2006,52(3):343-372

The most commonly used models for analysing local dependencies in DNA sequences are (high-order) Markov chains. Incorporating knowledge about the possible grouping of the nucleotides makes it possible to define dedicated sub-classes of Markov chains. The problem of formulating lumpability hypotheses for a Markov chain is therefore addressed. In the classical approach to lumpability, this problem can be formulated as the determination of an appropriate state space (smaller than the original state space) such that the lumped chain defined on this state space retains the Markov property. We propose a different perspective on lumpability where the state space is fixed and the partitioning of this state space is represented by a one-to-many probabilistic function within a two-level stochastic process. Three nested classes of lumped processes can be defined in this way as sub-classes of first-order Markov chains. These lumped processes enable parsimonious reparameterizations of Markov chains that help to reveal relevant partitions of the state space. Characterizations of the lumped processes on the original transition probability matrix are derived. Different model selection methods relying either on hypothesis testing or on penalized log-likelihood criteria are presented as well as extensions to lumped processes constructed from high-order Markov chains. The relevance of the proposed approach to lumpability is illustrated by the analysis of DNA sequences. In particular, the use of lumped processes makes it possible to highlight differences between intronic sequences and gene untranslated region sequences.
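As a point of reference for the classical notion of lumpability that this abstract contrasts itself with, the sketch below checks the standard strong-lumpability condition: for every pair of blocks, all states in the source block must have the same total probability of jumping into the target block. It does not implement the authors' two-level process. The function name, the transition matrix and the purine/pyrimidine grouping are illustrative assumptions, not taken from the paper.

```python
def is_strongly_lumpable(P, partition, tol=1e-12):
    """Check the classical strong-lumpability condition.

    P is a row-stochastic matrix (list of lists); partition is a list of
    blocks, each block a list of state indices.
    """
    for target in partition:
        for source in partition:
            # Probability of moving from each state of `source` into `target`.
            sums = [sum(P[i][j] for j in target) for i in source]
            if max(sums) - min(sums) > tol:
                return False
    return True


# Hypothetical first-order chain on nucleotides, indexed A=0, C=1, G=2, T=3.
P = [
    [0.1, 0.3, 0.2, 0.4],  # from A
    [0.3, 0.1, 0.3, 0.3],  # from C
    [0.2, 0.5, 0.1, 0.2],  # from G
    [0.4, 0.2, 0.2, 0.2],  # from T
]

PURINE_PYRIMIDINE = [[0, 2], [1, 3]]  # {A, G} versus {C, T}
OTHER_GROUPING = [[0, 1], [2, 3]]     # {A, C} versus {G, T}

lumpable_ry = is_strongly_lumpable(P, PURINE_PYRIMIDINE)
lumpable_other = is_strongly_lumpable(P, OTHER_GROUPING)
```

With this made-up matrix the purine/pyrimidine grouping satisfies the condition while the other grouping does not, which is the kind of contrast a lumpability hypothesis is meant to expose.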

2.

Cells coupled by voltage-dependent gap junctions: the asymptotic dynamical limit

Baigent S 《Bio Systems》2003,68(2-3):213-222

We study the steady state and dynamical properties of a pair of cells coupled by a voltage-dependent gap junction. The cells have linear membrane properties, and the gap junction is modelled using a simple Markov chain with a voltage-dependent transition matrix. We first show that the voltage-independent case is globally convergent using energy dissipation as a Lyapunov function for the cells, and standard results on the convergence of homogeneous Markov chains for the junction. For the voltage-dependent case, we use the difference in cell and gap junction time scales to reduce the coupled equations for cells and the gap junction to a single equation for the gap junction, but with a transition matrix that depends upon the current gap junction state. We identify cooperativity as a key property behind the global convergence of Markov chains and investigate convergence of the voltage-dependent system by establishing some conditions under which cooperativity is preserved.

3.

Coalescent process with fluctuating population size and its effective size (total citations: 3; self-citations: 0; citations by others: 3)

Sano A Shimizu A Iizuka M 《Theoretical population biology》2004,65(1):39-48

We consider a Wright-Fisher model whose population size is a finite Markov chain. We introduce a sequence of two-dimensional discrete time Markov chains whose components describe the coalescent process and the fluctuation of population size. For the limiting process of the sequence of Markov chains, the relationship of the expectation of coalescence time to the harmonic and the arithmetic means of population sizes is shown, and the Laplace transform of the distribution of coalescence time is calculated. We define the coalescence effective population size (cEPS) by the expectation of coalescence time. We show that cEPS is strictly larger (resp. smaller) than the harmonic (resp. arithmetic) mean. As the population size fluctuates more quickly (resp. slowly), cEPS is closer to the harmonic (resp. arithmetic) mean. For the case of a two-valued Markov chain, we show the explicit expression of cEPS and its dependency on the sample size.
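To make the two bounds in this abstract concrete, the following sketch computes the harmonic and arithmetic means for a population that spends equal time at two sizes. It does not compute cEPS itself, whose explicit expression is not given in the abstract; the sizes 100 and 1000 are made-up values.

```python
from statistics import harmonic_mean, mean

# Hypothetical two-valued population size, each value visited equally often.
population_sizes = [100, 1000]

lower_bound = harmonic_mean(population_sizes)  # approached under fast fluctuation
upper_bound = mean(population_sizes)           # approached under slow fluctuation
```

Here the harmonic mean is 2000/11 (about 181.8) and the arithmetic mean is 550, so according to the abstract the cEPS of such a population would fall strictly between these two numbers.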

4.

Markovian approximation to the finite loci coalescent with recombination along multiple sequences

《Theoretical population biology》2014

The coalescent with recombination process has initially been formulated backwards in time, but simulation algorithms and inference procedures often apply along sequences. Therefore it is of major interest to approximate the coalescent with recombination process by a Markov chain along sequences. We consider the finite loci case and two or more sequences. We formulate a natural Markovian approximation for the tree building process along the sequences, and derive simple and analytically tractable formulae for the distribution of the tree at the next locus conditioned on the tree at the present locus. We compare our Markov approximation to other sequential Markov chains and discuss various applications.

5.

Gaußsche Markovketten zweiter Ordnung

L. Berg 《Biometrical journal. Biometrische Zeitschrift》1984,26(3):279-288

We find the general form of the proper Gaussian Markov chains of second order and give an example of them. Compared with Gaussian Markov processes, they can be used as an improved growth model.

6.

On the integration of biotic interaction and environmental constraints at the biogeographical scale

Kévin Cazelles Nicolas Mouquet David Mouillot Dominique Gravel 《Ecography》2016,39(10):921-931

Biogeography is primarily concerned with the spatial distribution of biodiversity, including performing scenarios in a changing environment. The efforts deployed to develop species distribution models have resulted in predictive tools, but have mostly remained correlative and have largely ignored biotic interactions. Here we build upon the theory of island biogeography as a first approximation to the assembly dynamics of local communities embedded within a metacommunity context. We include all types of interactions and introduce environmental constraints on colonization and extinction dynamics. We develop a probabilistic framework based on Markov chains and derive probabilities for the realization of species assemblages, rather than single species occurrences. We consider the expected distribution of species richness under different types of ecological interactions. We also illustrate the potential of our framework by studying the interplay between different ecological requirements, interactions and the distribution of biodiversity along an environmental gradient. Our framework supports the idea that future research in biogeography requires a coherent integration of several ecological concepts into a single theory in order to achieve conceptual and methodological innovations, such as the switch from single‐species distribution to community distribution.

7.

Transformations that preserve detailed balance in Markov models.

William J Bruno John E Pearson 《Journal of computational biology》2006,13(9):1574-1578

Aggregated Markov processes related by similarity transformation are equivalent in that they cannot be distinguished by steady-state experiments. We derive an explicit formula for the set of all detailed-balance preserving similarity transformations between such continuous time Markov chains with N states. The matrices that define the allowed similarity transformations are found to be a simple non-linear function applied to almost any element of the special orthogonal group in N dimensions. Since a model is identifiable only if there are no similarity transformations to an equivalent model, we expect this result to prove useful in the theory of identification of aggregated Markov chains, an enterprise of growing importance as more and more single molecules yield to observation.
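The property being preserved here, detailed balance, can be stated as π_i q_ij = π_j q_ji for every pair of states of a continuous-time chain with rate matrix Q and stationary distribution π. The sketch below only checks that condition; it does not construct the similarity transformations derived in the paper. The two rate matrices are hypothetical examples.

```python
def satisfies_detailed_balance(Q, pi, tol=1e-12):
    """Return True if pi[i] * Q[i][j] == pi[j] * Q[j][i] for all i != j."""
    n = len(Q)
    return all(
        abs(pi[i] * Q[i][j] - pi[j] * Q[j][i]) <= tol
        for i in range(n)
        for j in range(i + 1, n)
    )


# Hypothetical linear scheme C1 <-> C2 <-> O (reversible by construction).
Q_LINEAR = [
    [-2.0, 2.0, 0.0],
    [1.0, -4.0, 3.0],
    [0.0, 6.0, -6.0],
]
PI_LINEAR = [0.25, 0.5, 0.25]

# Hypothetical one-way cycle 0 -> 1 -> 2 -> 0 (stationary but not reversible).
Q_CYCLE = [
    [-1.0, 1.0, 0.0],
    [0.0, -1.0, 1.0],
    [1.0, 0.0, -1.0],
]
PI_CYCLE = [1 / 3, 1 / 3, 1 / 3]
```

The linear scheme passes the check, whereas the one-way cycle has a uniform stationary distribution but carries a net probability current and therefore fails it.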

8.

Towards a unified framework for connectivity that disentangles movement and mortality in space and time

Robert J. Fletcher Jorge A. Sefair Chao Wang Caroline L. Poli Thomas A. H. Smith Emilio M. Bruna Robert D. Holt Michael Barfield Andrew J. Marx Miguel A. Acevedo 《Ecology letters》2019,22(10):1680-1689

Predicting connectivity, or how landscapes alter movement, is essential for understanding the scope for species persistence with environmental change. Although it is well known that movement is risky, connectivity modelling often conflates behavioural responses to the matrix through which animals disperse with mortality risk. We derive new connectivity models using random walk theory, based on the concept of spatial absorbing Markov chains. These models decompose the role of matrix on movement behaviour and mortality risk, can incorporate species distribution to predict the amount of flow, and provide both short‐ and long‐term analytical solutions for multiple connectivity metrics. We validate the framework using data on movement of an insect herbivore in 15 experimental landscapes. Our results demonstrate that disentangling the roles of movement behaviour and mortality risk is fundamental to accurately interpreting landscape connectivity, and that spatial absorbing Markov chains provide a generalisable and powerful framework with which to do so.
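A generic absorbing Markov chain illustrates how movement and mortality can be kept as separate outcomes. In the sketch below, Q holds transitions among transient (matrix) cells and R holds transitions into two absorbing fates, settling and dying; the absorption probabilities B = (I − Q)⁻¹R are obtained by summing the series R + QR + Q²R + … iteratively. This is textbook absorbing-chain algebra, not the authors' connectivity metrics, and all numbers are invented.

```python
def absorption_probabilities(Q, R, iterations=200):
    """Approximate B = (I - Q)^-1 R by iterating B <- R + Q B."""
    n, m = len(R), len(R[0])
    B = [row[:] for row in R]
    for _ in range(iterations):
        B = [
            [R[i][a] + sum(Q[i][k] * B[k][a] for k in range(n)) for a in range(m)]
            for i in range(n)
        ]
    return B


# Two hypothetical transient cells; the animal hops between them.
Q = [
    [0.0, 0.5],
    [0.5, 0.0],
]
# Columns: [settles in habitat, dies]. Each full row of (Q | R) sums to 1.
R = [
    [0.4, 0.1],
    [0.2, 0.3],
]

B = absorption_probabilities(Q, R)
```

For these values an animal starting in the first cell eventually settles with probability 2/3 and dies with probability 1/3; starting in the second cell the probabilities are 8/15 and 7/15.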

9.

A representation of DNA primary sequences by random walk

Bai FL Liu YZ Wang TM 《Mathematical biosciences》2007,209(1):282-291

10.

Steady-state analysis of genetic regulatory networks modelled by probabilistic boolean networks (total citations: 1; self-citations: 0; citations by others: 1)

Shmulevich I Gluhovsky I Hashimoto RF Dougherty ER Zhang W 《Comparative and Functional Genomics》2003,4(6):601-608

Probabilistic Boolean networks (PBNs) have recently been introduced as a promising class of models of genetic regulatory networks. The dynamic behaviour of PBNs can be analysed in the context of Markov chains. A key goal is the determination of the steady-state (long-run) behaviour of a PBN by analysing the corresponding Markov chain. This allows one to compute the long-term influence of a gene on another gene or determine the long-term joint probabilistic behaviour of a few selected genes. Because matrix-based methods quickly become prohibitive for large sizes of networks, we propose the use of Monte Carlo methods. However, the rate of convergence to the stationary distribution becomes a central issue. We discuss several approaches for determining the number of iterations necessary to achieve convergence of the Markov chain corresponding to a PBN. Using a recently introduced method based on the theory of two-state Markov chains, we illustrate the approach on a sub-network designed from human glioma gene expression data and determine the joint steady-state probabilities for several groups of genes.
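The Monte Carlo idea can be illustrated on the smallest possible case, a two-state Markov chain, where the simulated long-run frequency can be compared with the exact stationary probability a / (a + b). The sketch below is a generic simulation, not a PBN and not the convergence diagnostic used in the paper; the transition probabilities, burn-in length and seed are arbitrary assumptions.

```python
import random


def estimate_stationary_prob(a, b, steps=100_000, burn_in=1_000, seed=0):
    """Estimate the long-run probability of state 1.

    a = P(0 -> 1), b = P(1 -> 0). One long trajectory is simulated and the
    first `burn_in` steps are discarded before counting visits.
    """
    rng = random.Random(seed)
    state = 0
    visits = 0
    for t in range(burn_in + steps):
        if state == 0:
            if rng.random() < a:
                state = 1
        elif rng.random() < b:
            state = 0
        if t >= burn_in:
            visits += state
    return visits / steps


def exact_stationary_prob(a, b):
    """Closed-form stationary probability of state 1 for the same chain."""
    return a / (a + b)
```

For example, `estimate_stationary_prob(0.2, 0.3)` should land close to the exact value `exact_stationary_prob(0.2, 0.3)`, which is 0.4; how many steps are needed for a given accuracy is exactly the convergence question the abstract raises.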

11.

Probability description of ligand-receptor interactions. Evaluation of reliability of events with small and supersmall doses. I. Kinetics of ligand-receptor interactions.

K G Gurevich S D Varfolomeev 《Biochemistry. Biokhimiia》1999,64(9):1038-1048

12.

Marathon: An Open Source Software Library for the Analysis of Markov-Chain Monte Carlo Algorithms

Steffen Rechner Annabell Berger 《PloS one》2016,11(1)

We present the software library marathon, which is designed to support the analysis of sampling algorithms that are based on the Markov-Chain Monte Carlo principle. The main application of this library is the computation of properties of so-called state graphs, which represent the structure of Markov chains. We demonstrate applications and the usefulness of marathon by investigating the quality of several bounding methods on four well-known Markov chains for sampling perfect matchings and bipartite graphs. In a set of experiments, we compute the total mixing time and several of its bounds for a large number of input instances. We find that the upper bound gained by the famous canonical path method is often several orders of magnitude larger than the total mixing time and deteriorates with growing input size. In contrast, the spectral bound is found to be a precise approximation of the total mixing time.

13.

Markov chains: computing limit existence and approximations with DNA

Cardona M Colomer MA Conde J Miret JM Miró J Zaragoza A 《Bio Systems》2005,81(3):261-266

We present two algorithms to perform computations over Markov chains. The first one determines whether the sequence of powers of the transition matrix of a Markov chain converges or not to a limit matrix. If it does converge, the second algorithm enables us to estimate this limit. The combination of these algorithms allows the computation of a limit using DNA computing. In this sense, we have encoded the states and the transition probabilities using strands of DNA for generating paths of the Markov chain.
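The two questions posed in this abstract (do the powers of the transition matrix converge, and if so to what?) can be explored numerically on a conventional computer. The sketch below is a simple heuristic, not the DNA-computing algorithms of the paper: it multiplies by P repeatedly and stops when two successive powers agree to within a tolerance. The function names, tolerance and example matrices are assumptions, and a very slowly converging chain could be misreported as non-convergent.

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of lists."""
    n = len(A)
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def limit_of_powers(P, max_power=1000, tol=1e-12):
    """Return an estimate of lim P^n, or None if no limit is detected."""
    current = P
    for _ in range(max_power):
        nxt = mat_mul(current, P)
        diff = max(
            abs(x - y) for rx, ry in zip(nxt, current) for x, y in zip(rx, ry)
        )
        if diff < tol:
            return nxt
        current = nxt
    return None


APERIODIC = [[0.9, 0.1], [0.5, 0.5]]  # powers converge
PERIODIC = [[0.0, 1.0], [1.0, 0.0]]   # powers alternate forever
```

For the aperiodic example both rows of the limit approach (5/6, 1/6), while for the periodic example the powers alternate between the identity and the swap matrix, so the function returns `None`.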

14.

LD-SPatt: large deviations statistics for patterns on Markov chains.

G Nuel 《Journal of computational biology》2004,11(6):1023-1033

Statistics on Markov chains are widely used for the study of patterns in biological sequences. Statistics on these models can be done through several approaches. Central limit theorems (CLT), which produce Gaussian approximations, are among the most popular. Unfortunately, in order to find a pattern of interest, these methods have to deal with tail distribution events where the CLT performs especially poorly. In this paper, we propose a new approach based on the large deviations theory to assess pattern statistics. We first recall theoretical results for empiric mean (level 1) as well as empiric distribution (level 2) large deviations on Markov chains. Then, we present the applications of these results focusing on numerical issues. LD-SPatt is the name of the GPL software implementing these algorithms. We compare this approach to several existing ones in terms of complexity and reliability and show that the large deviations are more reliable than the Gaussian approximations in absolute values as well as in terms of ranking and are at least as reliable as compound Poisson approximations. Finally, we discuss some further possible improvements and applications of this new method.

15.

First and second moment of counts of words in random texts generated by Markov chains (total citations: 3; self-citations: 0; citations by others: 3)

Kleffe J.; Borodovsky M. 《Bioinformatics (Oxford, England)》1992,8(5):433-441

An exact expression for the variance of random frequency that a given word has in text generated by a Markov chain is presented. The result is applied to periodic Markov chains, which describe the protein-coding DNA sequences better than simple Markov chains. A new solution to the problem of word overlap is proposed. It was found that the expected frequency and overlapping properties determine most of the variance. The expectation and variance of counts for triplets are compared with experimental counts in Escherichia coli coding sequences.
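The first moment mentioned in the title has a simple closed form for a stationary first-order chain: a word w₁…w_m is expected to occur (n − m + 1) · π(w₁) · Π P(w_i → w_{i+1}) times in a text of length n. The sketch below computes only this expectation; the variance formula, which is the paper's contribution and depends on word overlaps, is not reproduced. The uniform model used as input is a made-up example.

```python
def expected_word_count(word, n, pi, P):
    """Expected occurrences of `word` in a length-n stationary Markov text.

    pi maps each letter to its stationary probability; P maps each letter
    to a dict of transition probabilities to the next letter.
    """
    if len(word) > n:
        return 0.0
    prob = pi[word[0]]
    for x, y in zip(word, word[1:]):
        prob *= P[x][y]
    return (n - len(word) + 1) * prob


ALPHABET = "ACGT"
# Hypothetical model: independent, uniformly distributed nucleotides.
UNIFORM_PI = {x: 0.25 for x in ALPHABET}
UNIFORM_P = {x: {y: 0.25 for y in ALPHABET} for x in ALPHABET}

atg_expected = expected_word_count("ATG", 1002, UNIFORM_PI, UNIFORM_P)
```

Under this uniform model a triplet such as ATG has probability 1/64 at each of the 1000 possible start positions, giving an expected count of 15.625.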

16.

Bayesian analysis of correlated evolution of discrete characters by reversible-jump Markov chain Monte Carlo

Pagel M Meade A 《The American naturalist》2006,167(6):808-825

We describe a Bayesian method for investigating correlated evolution of discrete binary traits on phylogenetic trees. The method fits a continuous-time Markov model to a pair of traits, seeking the best fitting models that describe their joint evolution on a phylogeny. We employ the methodology of reversible-jump (RJ) Markov chain Monte Carlo to search among the large number of possible models, some of which conform to independent evolution of the two traits, others to correlated evolution. The RJ Markov chain visits these models in proportion to their posterior probabilities, thereby directly estimating the support for the hypothesis of correlated evolution. In addition, the RJ Markov chain simultaneously estimates the posterior distributions of the rate parameters of the model of trait evolution. These posterior distributions can be used to test among alternative evolutionary scenarios to explain the observed data. All results are integrated over a sample of phylogenetic trees to account for phylogenetic uncertainty. We implement the method in a program called RJ Discrete and illustrate it by analyzing the question of whether mating system and advertisement of estrus by females have coevolved in the Old World monkeys and great apes.

17.

Evolution of probability measures by cellular automata on algebraic topological Markov chains

Maass A Martínez S 《Biological research》2003,36(1):113-118

In this paper we review some recent results on the evolution of probability measures under cellular automata acting on a fullshift. In particular we discuss the crucial role of the attractiveness of maximal measures. We enlarge the context of the results of a previous study of topological Markov chains that are Abelian groups; the shift map is an automorphism of this group. This is carried out by studying the dynamics of Markov measures by a particular additive cellular automaton. Many of these topics were within the focus of Francisco Varela's mathematical interests.

18.

Optimal choice of word length when comparing two Markov sequences using a χ²-statistic

Xin Bai Kujin Tang Jie Ren Michael Waterman Fengzhu Sun 《BMC genomics》2017,18(6):732

Background

Alignment-free sequence comparison using counts of word patterns (grams, k-tuples) has become an active research topic due to the large amount of sequence data from the new sequencing technologies. Genome sequences are frequently modelled by Markov chains and the likelihood ratio test or the corresponding approximate χ²-statistic has been suggested to compare two sequences. However, it is not known how to best choose the word length k in such studies.

Results

We develop an optimal strategy to choose k by maximizing the statistical power of detecting differences between two sequences. Let the orders of the Markov chains for the two sequences be r₁ and r₂, respectively. We show through both simulations and theoretical studies that the optimal k = max(r₁, r₂) + 1 for both long sequences and next generation sequencing (NGS) read data. The orders of the Markov chains may be unknown and several methods have been developed to estimate the orders of Markov chains based on both long sequences and NGS reads. We study the power loss of the statistics when the estimated orders are used. It is shown that the power loss is minimal for some of the estimators of the orders of Markov chains.

Conclusion

Our studies provide guidelines on choosing the optimal word length for the comparison of Markov sequences.
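The rule stated in the Results paragraph, k = max(r₁, r₂) + 1, and the word counts it feeds into are easy to sketch. The code below counts overlapping k-tuples in a sequence and applies the rule to two given orders; it does not compute the χ²-statistic or estimate the orders. The function names and the toy sequence are assumptions.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count all overlapping words of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def suggested_word_length(r1, r2):
    """Word length k = max(r1, r2) + 1 for Markov orders r1 and r2."""
    return max(r1, r2) + 1


k = suggested_word_length(1, 2)      # orders 1 and 2 give k = 3
counts = kmer_counts("ACGACGT", k)   # toy sequence
```

For the toy sequence the 3-tuples are ACG (twice), CGA, GAC and CGT (once each).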

19.

Drifting Markov models with polynomial drift and applications to DNA sequences

Vergne N 《Statistical applications in genetics and molecular biology》2008,7(1):Article6

In this article, we introduce the drifting Markov models (DMMs) which are inhomogeneous Markov models designed for modeling the heterogeneities of sequences (in our case DNA or protein sequences) in a more flexible way than homogeneous Markov chains or even hidden Markov models (HMMs). We focus here on the polynomial drift: the transition matrix varies in a polynomial way. To show the reliability of our models on DNA, we exhibit high similarities between the probability distributions of nucleotides obtained by our models and the frequencies of these nucleotides computed by using a sliding window. In a further step, these DMMs can be used as the states of an HMM: on each of its segments, the observed process can be modeled by a drifting Markov model. Search of rare words in DNA sequences remains possible with DMMs and according to the fits provided, DMMs turn out to be a powerful tool for this purpose. The software is available on request from the author. It will soon be integrated into the seq++ library (http://stat.genopole.cnrs.fr/seqpp/).
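The simplest polynomial drift is linear: the transition matrix at position t of a sequence of length n is a convex combination of a start matrix and an end matrix. The sketch below shows that degree-1 case only, as an illustration of what "the transition matrix varies in a polynomial way" can mean; it is not the author's software, and the parameterisation, function name and matrices are assumptions.

```python
def drifting_matrix(P_start, P_end, t, n):
    """Linear drift: (1 - t/n) * P_start + (t/n) * P_end for 0 <= t <= n."""
    w = t / n
    return [
        [(1 - w) * a + w * b for a, b in zip(row_start, row_end)]
        for row_start, row_end in zip(P_start, P_end)
    ]


# Hypothetical two-letter alphabet (for example AT-rich versus GC-rich).
P_START = [[0.9, 0.1], [0.2, 0.8]]
P_END = [[0.5, 0.5], [0.6, 0.4]]

midpoint = drifting_matrix(P_START, P_END, 5, 10)
```

Because each intermediate matrix is a convex combination of two stochastic matrices, its rows still sum to 1, so every position of the sequence has a valid transition matrix.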

20.

Statistical analysis of nucleotide sequences. (total citations: 5; self-citations: 4; citations by others: 1)

E E Stückle C Emmrich U Grob P J Nielsen 《Nucleic acids research》1990,18(22):6641-6647

In order to scan nucleic acid databases for potentially relevant but as yet unknown signals, we have developed an improved statistical model for pattern analysis of nucleic acid sequences by modifying previous methods based on Markov chains. We demonstrate the importance of selecting the appropriate parameters in order for the method to function at all. The model allows the simultaneous analysis of several short sequences with unequal base frequencies and Markov order k not equal to 0 as is usually the case in databases. As a test of these modifications, we show that in E. coli sequences there is a bias against palindromic hexamers which correspond to known restriction enzyme recognition sites.
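A palindromic hexamer in the sense used here is a six-base word equal to its own reverse complement, such as the EcoRI site GAATTC. The sketch below finds and counts such words in a sequence; it does not implement the Markov-based significance model of the paper. The helper names and the toy sequence are assumptions.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(word):
    """True if the word equals its own reverse complement."""
    return word == reverse_complement(word)


def palindromic_hexamer_counts(seq):
    """Count every overlapping 6-mer in seq that is palindromic."""
    hexamers = (seq[i:i + 6] for i in range(len(seq) - 5))
    return Counter(h for h in hexamers if is_palindromic(h))


toy_counts = palindromic_hexamer_counts("AAGAATTCAA")
```

In the toy sequence only GAATTC qualifies, so it is the single entry in the resulting counter; comparing such observed counts with model expectations is what would reveal the bias the abstract describes.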