

The Genomes of Oryza sativa: a history of duplications
Authors:Yu Jun, Wang Jun, Lin Wei, Li Songgang, Li Heng, Zhou Jun, Ni Peixiang, Dong Wei, Hu Songnian, Zeng Changqing, Zhang Jianguo, Zhang Yong, Li Ruiqiang, Xu Zuyuan, Li Shengting, Li Xianran, Zheng Hongkun, Cong Lijuan, Lin Liang, Yin Jianning, Geng Jianing, Li Guangyuan, Shi Jianping, Liu Juan, Lv Hong, Li Jun, Wang Jing, Deng Yajun, Ran Longhua, Shi Xiaoli, Wang Xiyin, Wu Qingfa, Li Changfeng, Ren Xiaoyu, Wang Jingqiang, Wang Xiaoling, Li Dawei, Liu Dongyuan, Zhang Xiaowei, Ji Zhendong, Zhao Wenming, Sun Yongqiao, Zhang Zhenpeng, Bao Jingyue, Han Yujun, Dong Lingli, Ji Jia, Chen Peng, Wu Shuming, Liu Jinsong, Xiao Ying, Bu Dongbo, Tan Jianlong
Institution:Beijing Institute of Genomics of the Chinese Academy of Sciences, Beijing Genomics Institute, Beijing Proteomics Institute, China. junyu@genomics.org.cn
Abstract:We report improved whole-genome shotgun sequences for the genomes of indica and japonica rice, both with multimegabase contiguity, or almost 1,000-fold improvement over the drafts of 2002. Tested against a nonredundant collection of 19,079 full-length cDNAs, 97.7% of the genes are aligned, without fragmentation, to the mapped super-scaffolds of one or the other genome. We introduce a gene identification procedure for plants that does not rely on similarity to known genes to remove erroneous predictions resulting from transposable elements. Using the available EST data to adjust for residual errors in the predictions, the estimated gene count is at least 38,000–40,000. Only 2%–3% of the genes are unique to any one subspecies, comparable to the amount of sequence that might still be missing. Despite this lack of variation in gene content, there is enormous variation in the intergenic regions. At least a quarter of the two sequences could not be aligned, and where they could be aligned, single nucleotide polymorphism (SNP) rates varied from as little as 3.0 SNP/kb in the coding regions to 27.6 SNP/kb in the transposable elements. A more inclusive new approach for analyzing duplication history is introduced here. It reveals an ancient whole-genome duplication, a recent segmental duplication on Chromosomes 11 and 12, and massive ongoing individual gene duplications. We find 18 distinct pairs of duplicated segments that cover 65.7% of the genome; 17 of these pairs date back to a common time before the divergence of the grasses. More important, ongoing individual gene duplications provide a never-ending source of raw material for gene genesis and are major contributors to the differences between members of the grass family.
This article is indexed in PubMed and other databases.