Information theory reveals large-scale synchronisation of statistical correlations in eukaryote genomes |
| |
Authors: | Manuel Dehnert, Werner E. Helm, Marc-Thorsten Hütt |
| |
Affiliation: | Bioinformatics Group, Department of Biology, Darmstadt University of Technology, D-64287 Darmstadt, Germany. |
| |
Abstract: | We study short-range correlations in DNA sequences with methods from information theory and statistics. We find a persistent degree of identity between the correlation patterns of different chromosomes of a species. Except for the case of human and chimpanzee, inter-species differences in this correlation pattern allow robust species distinction: in a clustering tree based on the correlation curves at the level of individual chromosomes, distinct clusters for the individual species are found. This capacity to distinguish species persists even when the length of the underlying sequences is drastically reduced. In comparison to the standard tool for studying symbol correlations in DNA sequences, namely the mutual information function, we find that an autoregressive model for higher-order Markov processes significantly improves species distinction due to an implicit subtraction of random background. |
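The abstract names the mutual information function as the standard tool for symbol correlations in DNA but does not spell it out. As a rough illustration only, the sketch below uses the textbook definition I(k) = Σ p_ab(k) log2[p_ab(k) / (p_a p_b)], where p_ab(k) is the frequency of symbol a followed by symbol b at distance k. The function names and the choice to estimate marginals from the observed pairs are assumptions made here, not the authors' implementation, and the autoregressive model from the paper is not reproduced.

```python
import math
from collections import Counter


def mutual_information(seq, k):
    """Mutual information (in bits) between symbols k positions apart.

    Hypothetical helper: probabilities are plain relative frequencies,
    with no finite-size correction.
    """
    if k < 1 or k >= len(seq):
        raise ValueError("k must satisfy 1 <= k < len(seq)")
    # Count every pair (seq[i], seq[i + k]).
    pairs = Counter(zip(seq, seq[k:]))
    n = sum(pairs.values())
    # Marginal counts of the first and second symbol of each pair.
    left, right = Counter(), Counter()
    for (a, b), count in pairs.items():
        left[a] += count
        right[b] += count
    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with counts to limit rounding.
        mi += p_ab * math.log2(p_ab * n * n / (left[a] * right[b]))
    return mi


def mutual_information_function(seq, max_k):
    """Correlation curve I(1), ..., I(max_k) for one sequence."""
    return [mutual_information(seq, k) for k in range(1, max_k + 1)]
```

For a strictly periodic toy sequence such as "ACGT" repeated many times, I(4) comes out at 2 bits, since each symbol fully determines the symbol four positions later. Curves of this kind, computed per chromosome, are the sort of input a clustering tree could be built on.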
| |
Keywords: | |
This article is indexed in ScienceDirect, PubMed, and other databases. |
|