Optical maps refine the bread wheat Triticum aestivum cv. Chinese Spring genome assembly
Authors:Tingting Zhu  Le Wang  Hélène Rimbert  Juan C Rodriguez  Karin R Deal  Romain De Oliveira  Frédéric Choulet  Gabriel Keeble-Gagnère  Josquin Tibbits  Jane Rogers  Kellye Eversole  Rudi Appels  Yong Q Gu  Martin Mascher  Jan Dvorak  Ming-Cheng Luo
Institution:1. Department of Plant Sciences, University of California, Davis, CA, 95616 USA (These authors contributed to this work.); 2. GDEC, Université Clermont Auvergne, INRAE, Clermont-Ferrand, 63000 France (These authors contributed to this work.); 3. Department of Plant Sciences, University of California, Davis, CA, 95616 USA; 4. GDEC, Université Clermont Auvergne, INRAE, Clermont-Ferrand, 63000 France; 5. Centre for AgriBioscience, Agriculture Victoria, AgriBio, Bundoora, VIC, 3083 Australia; 6. International Wheat Genome Sequencing Consortium, Eau Claire, WI, 54701 USA; 7. Centre for AgriBioscience, Agriculture Victoria, AgriBio, Bundoora, VIC, 3083 Australia, and International Wheat Genome Sequencing Consortium, Eau Claire, WI, 54701 USA; 8. Crop Improvement and Genetics Research Unit, USDA-ARS, Albany, CA, 94710 USA; 9. Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland, Germany

Abstract:Until recently, a reference-quality genome sequence for bread wheat was thought to be beyond the limits of genome sequencing and assembly technology, primarily because of the large genome size and >80% repetitive sequence content. The release of the chromosome-scale 14.5-Gb IWGSC RefSeq v1.0 genome sequence of bread wheat cv. Chinese Spring (CS) was, therefore, a milestone. Here, we used a direct label and stain (DLS) optical map of the CS genome, together with a prior nick, label, repair and stain (NLRS) optical map and sequence contigs assembled from Pacific Biosciences long reads, to refine the v1.0 assembly. Inconsistencies between the sequence and the maps were reconciled and gaps were closed. Gap filling and anchoring of 279 unplaced scaffolds increased the total length of the pseudomolecules by 168 Mb (excluding Ns). Positions were corrected for 233 scaffolds and orientations for 354 scaffolds, together representing 10% of the genome sequence. The accuracy of the remaining 90% of the assembly was validated. As a result of the increased contiguity, the numbers of transposable elements (TEs) and of intact TEs are higher in IWGSC RefSeq v2.1 than in v1.0. In total, 98% of the gene models identified in v1.0 were mapped onto the new assembly using a dedicated approach implemented in the MAGAAT pipeline. The number of high-confidence genes on pseudomolecules increased from 105 319 to 105 534. The reconciled assembly enhances the utility of the sequence for genetic mapping, comparative genomics, gene annotation and isolation, and more general studies on the biology of wheat.
Keywords:direct label and stain  pseudomolecule  transposable element  gene collinearity  Hi-C