The Random Nature of Genome Architecture: Predicting Open Reading Frame Distributions
Authors: Michael W. McCoy, Andrew P. Allen, James F. Gillooly
Affiliation: 1. Department of Biology, Boston University, Boston, Massachusetts, United States of America; 2. Department of Biological Sciences, Macquarie University, Sydney, New South Wales, Australia; 3. Department of Biology, University of Florida, Gainesville, Florida, United States of America; Pasteur Institute, France
Abstract:

Background

A better understanding of the size and abundance of open reading frames (ORFs) in whole genomes may shed light on the factors that control genome complexity. Here we examine the statistical distributions of ORFs (i.e., the distribution of start and stop codons) in the fully sequenced genomes of 297 prokaryotes and 14 eukaryotes.

Methodology/Principal Findings

By fitting mixture models to data from whole-genome sequences, we show that the size-frequency distributions of ORFs are strikingly similar across prokaryotic and eukaryotic genomes. Moreover, we show that (i) a large fraction (60–80%) of ORF size-frequency distributions can be predicted a priori with a stochastic assembly model based on GC content, and that (ii) the size-frequency distributions of the remaining “non-random” ORFs are well fitted by log-normal or gamma distributions and are similar to the size distributions of annotated proteins.
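To make the idea of a GC-based stochastic model concrete, the following is a minimal sketch of one possible null model, not necessarily the one used by the authors. It assumes (our assumption, not stated in the abstract) that bases are drawn independently, with G and C each at frequency GC/2 and A and T each at (1 - GC)/2, and that the standard stop codons TAA, TAG, and TGA apply. Under those assumptions the number of sense codons before the first in-frame stop follows a geometric distribution. The function names are hypothetical.

```python
# Illustrative null model: random ORF lengths as a function of GC content.
# Assumptions (not from the source): independent bases, P(G) = P(C) = gc / 2,
# P(A) = P(T) = (1 - gc) / 2, and the standard stop codons TAA, TAG, TGA.

STOP_CODONS = ("TAA", "TAG", "TGA")


def base_probs(gc):
    """Return per-base probabilities for a given GC fraction (0 < gc < 1)."""
    if not 0.0 < gc < 1.0:
        raise ValueError("gc must lie strictly between 0 and 1")
    at_each = (1.0 - gc) / 2.0
    gc_each = gc / 2.0
    return {"A": at_each, "T": at_each, "G": gc_each, "C": gc_each}


def stop_codon_prob(gc):
    """Probability that a random codon is a stop codon."""
    p = base_probs(gc)
    return sum(p[c[0]] * p[c[1]] * p[c[2]] for c in STOP_CODONS)


def orf_length_pmf(gc, k):
    """P(exactly k sense codons precede the first in-frame stop codon)."""
    q = stop_codon_prob(gc)
    return (1.0 - q) ** k * q


def expected_orf_length(gc):
    """Mean number of sense codons before the first stop (geometric mean)."""
    q = stop_codon_prob(gc)
    return (1.0 - q) / q


if __name__ == "__main__":
    for gc in (0.3, 0.5, 0.7):
        print(gc, stop_codon_prob(gc), expected_orf_length(gc))
```

In this sketch, a genome with 50% GC has a per-codon stop probability of 3/64, and because stop codons are AT-rich, higher GC content lowers that probability and lengthens the expected random ORF. A fuller treatment would also model start codons and both strands.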

Conclusions/Significance

Our findings suggest that stochastic processes have played a primary role in the evolution of genome complexity, and that common processes govern the conservation and loss of functional genomic units in both prokaryotes and eukaryotes.