Four basic symmetry types in the universal 7-cluster structure of microbial genomic sequences |
| |
Authors: | Gorban, Alexander N.; Popova, Tatyana G.; Zinovyev, Andrei Y. |
| |
Affiliation: | Institute of Computational Modeling, Russian Academy of Sciences, Russia. |
| |
Abstract: | Coding information is the main source of heterogeneity (non-randomness) in the sequences of microbial genomes. The heterogeneity corresponds to a cluster structure in triplet distributions of relatively short genomic fragments (200-400 bp). We found a universal 7-cluster structure in microbial genomic sequences and explained its properties. We show that the codon usage of bacterial genomes is, to high accuracy, a multi-linear function of their genomic G+C content. Based on the analysis of 143 completely sequenced bacterial genomes available in GenBank in August 2004, we show that four "pure" types of the 7-cluster structure are observed. Animated 3D scatter plots of the cluster structure for all 143 genomes are collected in a database available on our website (http://www.ihes.fr/~zinovyev/7clusters). The findings can be readily introduced into software for gene prediction, sequence alignment, or classification of microbial genomes. |
| |
Keywords: | |
This article is indexed in PubMed and other databases.
|
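The "triplet distributions of relatively short genomic fragments" mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes non-overlapping triplets read from a chosen phase, a 300 bp window (one value inside the 200-400 bp range the abstract gives), and non-overlapping windows. The function names `triplet_frequencies` and `fragment_vectors` are hypothetical.

```python
from collections import Counter
from itertools import product

# The 64 possible triplets over the DNA alphabet, in a fixed order.
TRIPLETS = ["".join(p) for p in product("ACGT", repeat=3)]


def triplet_frequencies(fragment, phase=0):
    """Return 64 relative frequencies of non-overlapping triplets.

    Reading starts at offset `phase` (0, 1 or 2). Triplets that contain
    symbols other than A, C, G, T are ignored (an assumption of this sketch).
    """
    fragment = fragment.upper()
    counts = Counter(
        fragment[i:i + 3] for i in range(phase, len(fragment) - 2, 3)
    )
    total = sum(counts[t] for t in TRIPLETS)
    if total == 0:
        return [0.0] * len(TRIPLETS)
    return [counts[t] / total for t in TRIPLETS]


def fragment_vectors(sequence, window=300, step=300):
    """Cut `sequence` into windows and return one 64-dim vector per window.

    The window and step sizes are assumed values, not taken from the paper.
    """
    return [
        triplet_frequencies(sequence[start:start + window])
        for start in range(0, len(sequence) - window + 1, step)
    ]
```

Each fragment becomes a point in a 64-dimensional frequency space. A cluster structure such as the one the abstract describes would be looked for among these points, for example after projecting them to three dimensions.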