Clustering of protein domains in the human genome |
| |
Authors: | Mayor, Lianne R.; Fleming, Keiran P.; Müller, Arne; Balding, David J.; Sternberg, Michael J. E. |
| |
Affiliation: | Department of Epidemiology and Public Health, Imperial College, St Mary's Campus, London W2 1PG, UK. |
| |
Abstract: | We present a systematic study of the clustering of genes within the human genome based on homology inferred from both sequence and structural similarity. The 3D-Genomics automated proteome annotation pipeline was utilised to infer homology for each protein domain in the genome, for the 26 superfamilies most highly represented in the Structural Classification Of Proteins (SCOP) database. This approach enabled us to identify homologues that could not be detected by sequence-based methods alone. For each superfamily, we investigated the distribution, both within and among chromosomes, of genes encoding at least one domain within the superfamily. The results indicate a diversity of clustering behaviours: some superfamilies showed no evidence of any clustering, and others displayed significant clustering either within or among chromosomes, or both. Removal of tandem repeats reduced the levels of clustering observed, but some superfamilies still displayed highly significant clustering. Thus, our study suggests that either the process of gene duplication, or the evolution of the resulting clusters, differs between structural superfamilies. |
| |
Keywords: | tandem repeats; gene clustering; protein domains; genome evolution; bioinformatics |
This article is indexed in ScienceDirect, PubMed, and other databases. |
|