Coordinate systems for supergenomes |
| |
Authors: | Fabian Gärtner, Christian Höner zu Siederdissen, Lydia Müller, Peter F Stadler |
| |
Institution: | 1. Competence Center for Scalable Data Services and Solutions Dresden/Leipzig, Universität Leipzig, Leipzig, Germany; 2. Bioinformatics Group, Department of Computer Science, Universität Leipzig, Leipzig, Germany; 3. Interdisciplinary Center for Bioinformatics, Universität Leipzig, Leipzig, Germany; 4. Automatic Language Processing Group, Department of Computer Science, Universität Leipzig, Leipzig, Germany; 5. Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany; 6. Department of Theoretical Chemistry, University of Vienna, Vienna, Austria; 7. Center for non-coding RNA in Technology and Health, Frederiksberg C, Denmark; 8. Santa Fe Institute, Santa Fe, USA |
| |
Abstract: | Background: Genome sequences and genome annotation data have become available at ever increasing rates in response to the rapid progress in sequencing technologies. As a consequence, the demand for methods supporting comparative, evolutionary analysis is also growing. In particular, efficient tools to visualize -omics data simultaneously for multiple species are sorely lacking. A first and crucial step in this direction is the construction of a common coordinate system. Since genomes differ not only by rearrangements but also by large insertions, deletions, and duplications, the use of a single reference genome is insufficient, in particular when the number of species becomes large. Results: The computational problem then becomes to determine an order and orientations of optimal local alignments that are as co-linear as possible with all the genome sequences. We first review the most prominent approaches to model the problem formally and then proceed to show that it can be phrased as a particular variant of the Betweenness Problem, which is NP-hard in general. As exact solutions are beyond reach for the problem sizes of practical interest, we introduce a collection of heuristic simplifiers to resolve ordering conflicts. Conclusion: Benchmarks on real-life data ranging from bacterial to fly genomes demonstrate the feasibility of computing good common coordinate systems. |
| |
Keywords: | |
This article is indexed in SpringerLink and other databases. |
|