A general method for fast multiple sequence alignment
Affiliation: 1. Research Center for Interdisciplinary Studies on Structure Formation (RCSF), University of Bielefeld, Postfach 10 01 31, D-33501 Bielefeld, Germany; 2. Department of Mathematics, Massey University, Palmerston North, New Zealand
Abstract: We have developed a fast heuristic algorithm for multiple sequence alignment that provides near-optimal results for sufficiently homologous sequences. The algorithm applies the standard dynamic programming procedure to all pairs of sequences. The resulting score matrices for pairwise alignment give rise to secondary matrices containing the additional charges imposed by forcing the alignment path to run through a particular vertex. Such a constraint corresponds to slicing the sequences at the positions defining that vertex and aligning the resulting pairs of prefix and suffix sequences separately. From these secondary matrices, one can compute, for any given family of sequences, suitable positions for cutting all of these sequences simultaneously, thus reducing, in a divide-and-conquer fashion, the problem of aligning a family of n sequences of average length l to aligning two families of n sequences of approximately half that length. In this paper, we explain the method in detail for the case of 3 sequences, and we demonstrate its potential and its limits by discussing its behaviour on several test families. A generalization to more than 3 sequences is outlined, and some actual alignments constructed by our algorithm for various user-defined parameters are presented.
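The secondary matrices described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes a unit-cost distance (mismatch 1, gap 1, lower is better), and it assumes that the cut for three sequences is chosen by fixing the first sequence at its midpoint and minimizing the sum of the three pairwise additional charges. The abstract states neither the scoring scheme nor the selection rule, and the function names `forward_matrix`, `additional_cost`, and `cut_positions` are hypothetical.

```python
from itertools import product


def forward_matrix(a, b, mismatch=1, gap=1):
    """D[i][j] = minimal alignment cost of the prefixes a[:i] and b[:j].

    Unit mismatch and gap costs are an assumption made for this sketch.
    """
    D = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        D[i][0] = i * gap
    for j in range(1, len(b) + 1):
        D[0][j] = j * gap
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            sub = 0 if a[i - 1] == b[j - 1] else mismatch
            D[i][j] = min(D[i - 1][j - 1] + sub,
                          D[i - 1][j] + gap,
                          D[i][j - 1] + gap)
    return D


def additional_cost(a, b):
    """C[i][j] = extra cost of forcing the alignment path through vertex (i, j).

    Prefix cost plus suffix cost minus the unconstrained optimum. The suffix
    costs come from aligning the reversed sequences.
    """
    fwd = forward_matrix(a, b)
    rev = forward_matrix(a[::-1], b[::-1])
    best = fwd[len(a)][len(b)]
    return [[fwd[i][j] + rev[len(a) - i][len(b) - j] - best
             for j in range(len(b) + 1)]
            for i in range(len(a) + 1)]


def cut_positions(s1, s2, s3):
    """Pick cut positions (c1, c2, c3) for three sequences.

    Assumed rule: fix c1 at the midpoint of s1, then choose c2 and c3 so that
    the sum of the three pairwise additional costs is minimal.
    """
    c12 = additional_cost(s1, s2)
    c13 = additional_cost(s1, s3)
    c23 = additional_cost(s2, s3)
    c1 = len(s1) // 2
    candidates = ((c1, c2, c3)
                  for c2, c3 in product(range(len(s2) + 1),
                                        range(len(s3) + 1)))
    return min(candidates,
               key=lambda c: c12[c[0]][c[1]] + c13[c[0]][c[2]] + c23[c[1]][c[2]])
```

Applying `cut_positions` recursively to the resulting prefixes and suffixes, until the pieces are short enough to align exactly, would give the divide-and-conquer scheme sketched in the abstract. An additional cost of zero at a vertex means that some optimal pairwise alignment passes through it.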
This article is indexed in ScienceDirect and other databases.