Patterns of tandem repetition in plant whole genome assemblies |
| |
Authors: | Rafael Navajas-Pérez, Andrew H. Paterson |
| |
Institution: | (1) Plant Genome Mapping Laboratory, University of Georgia, Athens, GA 30602, USA |
| |
Abstract: | Tandem repeats often confound large genome assemblies. A survey of tandemly arrayed repetitive sequences was carried out in
whole genome sequences of the green alga Chlamydomonas
reinhardtii, the moss Physcomitrella patens, the monocots rice and sorghum, and the dicots Arabidopsis thaliana, poplar, grapevine, and papaya, in order to test how these assemblies deal with this fraction of DNA. Our results suggest
that plant genome assemblies preferentially include tandem repeats composed of shorter monomeric units (especially dinucleotide
and 9–30-bp repeats), whereas longer repeat units are more difficult to assemble. Nevertheless, although
currently available sequencing technologies struggle with larger arrays of repeated DNA, major well-known repetitive elements,
including centromeric and telomeric repeats as well as high-copy-number genes, were found to be reasonably well represented.
A database including all tandem repeat sequences characterized here was created to benefit future comparative genomic analyses. |
| |
Keywords: | Tandem repeats; Whole genome assemblies |
This article is indexed in SpringerLink and other databases. |
|
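To make the notion of a tandem repeat with a given monomer length concrete, the following is a minimal sketch of how perfect tandem arrays could be located in a sequence. It is not the method used in the survey, which the abstract does not describe. The function names, the 2–30 bp unit range, and the minimum of three copies are assumptions chosen for illustration, and only exact (non-degenerate) repeats are detected.

```python
import re


def is_primitive(unit):
    """True if unit is not itself a repetition of a shorter motif."""
    # A string is a repetition of a shorter string exactly when it
    # occurs inside its own doubling with the end characters removed.
    return unit not in (unit + unit)[1:-1]


def find_perfect_tandem_repeats(seq, min_unit=2, max_unit=30, min_copies=3):
    """Return (start, unit, copies) tuples for perfect tandem repeats.

    Hypothetical helper for illustration: parameters are assumptions,
    and imperfect (mutated) repeats are not reported.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # One captured monomer followed by at least (min_copies - 1) copies.
        pattern = re.compile(
            r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1)
        )
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip units such as ATAT, which are already covered by AT.
            if is_primitive(unit):
                copies = len(match.group(0)) // unit_len
                hits.append((match.start(), unit, copies))
    return hits


if __name__ == "__main__":
    # A dinucleotide array (AT x 5) flanked by non-repetitive bases.
    print(find_perfect_tandem_repeats("GGG" + "AT" * 5 + "CCC"))
```

Real tandem arrays usually contain substitutions and indels, so a practical survey would need an approximate-matching approach rather than exact regular expressions; the sketch only shows how monomer length and copy number define an array.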