Identifying compositionally homogeneous and nonhomogeneous domains within the human genome using a novel segmentation algorithm

Authors:	Eran Elhaik, Dan Graur, Krešimir Josić, Giddy Landan

Institution:	1. McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205; 2. Department of Biology and Biochemistry, University of Houston, Houston, TX 77204-5001; and 3. Department of Mathematics, University of Houston, Houston, TX 77204-3008, USA

Abstract:	It has been suggested that the mammalian genome is composed mainly of long compositionally homogeneous domains. Such domains are frequently identified using recursive segmentation algorithms based on the Jensen–Shannon divergence. However, a common difficulty with such methods is deciding when to halt the recursive partitioning and what criteria to use in deciding whether a detected boundary between two segments is real. We demonstrate that commonly used halting criteria are intrinsically biased, and propose IsoPlotter, a parameter-free segmentation algorithm that overcomes such biases by using a simple dynamic halting criterion and by testing the homogeneity of the inferred domains. IsoPlotter was compared with an alternative segmentation algorithm, D_JS, using two sets of simulated genomic sequences. Our results show that IsoPlotter was able to infer both long and short compositionally homogeneous domains with low GC content dispersion, whereas D_JS failed to identify short compositionally homogeneous domains and sequences with low compositional dispersion. By segmenting the human genome with IsoPlotter, we found that one-third of the genome is composed of compositionally nonhomogeneous domains and the remainder is a mixture of many short compositionally homogeneous domains and relatively few long ones.
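
To make the recursive segmentation idea in the abstract concrete, the following is a minimal Python sketch of binary segmentation driven by the Jensen–Shannon divergence of GC content. It is not IsoPlotter and not the authors' code. The function names (`entropy`, `best_split`, `segment`) and the fixed `threshold` argument are assumptions introduced here for illustration. In particular, the fixed threshold stands in for the kind of static halting criterion the abstract describes as biased; IsoPlotter's dynamic halting criterion and its homogeneity test are not reproduced. For a sequence of length n cut into a left part of length l and a right part of length n - l, the divergence used is D = H(whole) - (l/n) H(left) - ((n - l)/n) H(right), where H is the Shannon entropy of the GC fraction.

```python
import math


def entropy(p):
    """Shannon entropy (in bits) of a two-state source with GC fraction p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def is_gc(base):
    """True for G or C (either case); every other symbol counts as non-GC."""
    return base in "GCgc"


def best_split(seq):
    """Return (cut_index, divergence) for the cut that maximizes the
    Jensen-Shannon divergence between the left and right GC content.
    cut_index is None when no cut yields a positive divergence."""
    n = len(seq)
    total_gc = sum(is_gc(b) for b in seq)
    h_whole = entropy(total_gc / n)
    best_i, best_d = None, 0.0
    left_gc = 0
    for i in range(1, n):
        left_gc += is_gc(seq[i - 1])
        right_gc = total_gc - left_gc
        d = (h_whole
             - (i / n) * entropy(left_gc / i)
             - ((n - i) / n) * entropy(right_gc / (n - i)))
        if d > best_d:
            best_i, best_d = i, d
    return best_i, best_d


def segment(seq, threshold, offset=0):
    """Recursively partition seq into (start, end) domains.
    `threshold` is a hypothetical fixed halting criterion used only for
    illustration; it is not the dynamic criterion described in the abstract."""
    if len(seq) < 2:
        return [(offset, offset + len(seq))]
    i, d = best_split(seq)
    if i is None or d < threshold:
        return [(offset, offset + len(seq))]
    left = segment(seq[:i], threshold, offset)
    right = segment(seq[i:], threshold, offset + i)
    return left + right
```

For example, `segment("AT" * 50 + "GC" * 50, 0.1)` splits a 200-base toy sequence at position 100, where the GC content switches from 0 to 1. How many domains such a procedure reports on real sequences depends strongly on the chosen threshold, which is the difficulty the abstract raises about halting criteria.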
