An overview on the distribution of word counts in Markov chains.
Authors: S Schbath
Affiliation:Institut National de la Recherche Agronomique, Unité de Biométrie, Jouy-en-Josas, France. Sophie.Schbath@jouy.inra.fr
Abstract: This paper gives an overview of existing results on the statistical distribution of word counts in a Markovian sequence of letters. It presents results on the number of overlapping occurrences, the number of renewals, and the number of clumps, for both single words and sets of multiple words. Most of the results are asymptotic approximations that hold as the length of the sequence tends to infinity. Gaussian approximations give way to (compound) Poisson approximations for rare words. When DNA or protein sequences are modeled as stationary Markov chains, these results can be used to study the statistical frequency of motifs in a given sequence.
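The three counts named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper. The function names are hypothetical, and the definitions are assumptions based on common usage in the word-count literature: a renewal is an occurrence that does not overlap the previous renewal when scanning left to right, and a clump is a maximal run of mutually overlapping occurrences.

```python
def occurrence_starts(seq, word):
    """Start positions of all (possibly overlapping) occurrences of word in seq."""
    m = len(word)
    return [i for i in range(len(seq) - m + 1) if seq[i:i + m] == word]


def count_renewals(starts, m):
    """Count occurrences that do not overlap the previously counted renewal.

    Assumed definition: scan left to right; an occurrence is a renewal
    if it begins at or after the end of the last renewal.
    """
    count = 0
    next_free = 0  # first position not covered by the last renewal
    for s in starts:
        if s >= next_free:
            count += 1
            next_free = s + m
    return count


def count_clumps(starts, m):
    """Count maximal runs of overlapping occurrences.

    Assumed definition: a new clump begins whenever an occurrence does
    not overlap the occurrence immediately before it.
    """
    count = 0
    prev = None  # start of the previous occurrence, if any
    for s in starts:
        if prev is None or s >= prev + m:
            count += 1
        prev = s
    return count


seq = "AAAAATAA"   # toy sequence, chosen only for illustration
word = "AA"
starts = occurrence_starts(seq, word)
print(len(starts), count_renewals(starts, len(word)), count_clumps(starts, len(word)))
```

For a self-overlapping word such as `AA` the three counts differ, whereas for a word that cannot overlap itself they coincide, which is one reason the results distinguish between them.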