Genome-wide BAC-end sequencing of Musa acuminata DH Pahang reveals further insights into the genome organization of banana |
| |
Authors: | Rafael E Arango, Roberto C Togawa, Sebastien C Carpentier, Nicolas Roux, Bas L Hekkert, Gert H J Kema, Manoel T Souza Jr |
| |
Institution: | (1) Unidad de Biotecnología Vegetal UNALMED-CIB, Corporación para Investigaciones Biológicas (CIB), Carrera 72 A No. 78 B-141, Medellín, Colombia; (2) Escuela de Biociencias, Facultad de Ciencias, Universidad Nacional, Carrera 64 Calle 65, Medellín, Colombia; (3) Embrapa Genetic Resources & Biotechnology, CP 02372, CEP 70770-900 Brasília, Federal District, Brazil; (4) Division of Crop Biosystems, K.U.Leuven, 3001 Leuven, Belgium; (5) Global Musa Genomics Consortium, Bioversity International, Montpellier, France; (6) Plant Research International, 6708 PB Wageningen, The Netherlands; (7) Embrapa LABEX Europe, 6708 PB Wageningen, The Netherlands |
| |
Abstract: | Banana and plantain (Musa spp.) are grown in more than 120 countries in tropical and subtropical regions and constitute an important staple food for
millions of people. A Musa acuminata ssp. malaccensis DH Pahang bacterial artificial chromosome (BAC) library (MAMB) was submitted for BAC-end sequencing. MAMB consists of 23,040
clones with an average insert size of 140 kbp, giving roughly fivefold coverage of the banana genome. A total of 46,080 reads
were generated, and 42,750 (92.8%) high-quality sequences were obtained after trimming for vector and quality. Analysis of
these data shows a GC content of 41.39%, and interspersed repeats comprise 32.3% of the sequence. The most common repeated sequences found
show homology to ribosomal RNA genes, particularly 18S rRNA, while the Ty3/gypsy-type monkey retrotransposon is the most common
retroelement. The sequence data were used to generate a banana-specific repeat library containing 54 new repetitive elements,
which accounted for 11.86% of the total nucleotides. Simple sequence repeats represent 0.7% of the sequence data and allowed
the identification of 2,455 potentially useful marker sites. Functional annotation identified 2,705 sequences that could code
for proteins of known function. Microsynteny analysis shows more co-linear matches to Oryza sativa than to Arabidopsis thaliana. This database of BAC-end sequences is useful for the assembly of the complete banana genome sequence and is important for
identification in functional genomics experiments. |
| |
Keywords: | |
This article is indexed in SpringerLink and other databases. |
|
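
The library figures quoted in the abstract can be cross-checked with a short calculation. The sketch below uses only numbers stated in the abstract (23,040 clones, 140-kbp average insert, fivefold coverage, 46,080 reads, 42,750 high-quality reads). The function and constant names are illustrative. The "implied genome size" is derived here from the stated coverage and is not a value reported in the record; since "fivefold" is presumably rounded, it should be read as an approximation only.

```python
# Figures taken from the abstract.
CLONES = 23_040
AVG_INSERT_BP = 140_000      # 140 kbp average insert size
STATED_COVERAGE = 5          # "fivefold" as stated; presumably rounded
TOTAL_READS = 46_080
HIGH_QUALITY_READS = 42_750


def library_span_bp(clones: int, avg_insert_bp: int) -> int:
    """Total DNA held in the library: clone count times average insert size."""
    return clones * avg_insert_bp


def implied_genome_size_bp(span_bp: int, coverage: float) -> float:
    """Genome size implied by the library span and the stated coverage.

    This is a back-calculation (an assumption of this sketch),
    not a figure given in the abstract.
    """
    return span_bp / coverage


def pass_rate_percent(passed: int, total: int) -> float:
    """Share of reads that survive vector and quality trimming, in percent."""
    return 100.0 * passed / total


if __name__ == "__main__":
    span = library_span_bp(CLONES, AVG_INSERT_BP)
    print(f"Library span: {span:,} bp")
    print(f"Implied genome size: {implied_genome_size_bp(span, STATED_COVERAGE):,.0f} bp")
    print(f"Reads per clone: {TOTAL_READS / CLONES:.0f}")
    print(f"High-quality reads: {pass_rate_percent(HIGH_QUALITY_READS, TOTAL_READS):.1f}%")
```

The read count is exactly twice the clone count, consistent with one read from each end of every insert, and the pass rate rounds to the 92.8% given in the abstract.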