

A whole-genome phylogeny of the family Pasteurellaceae
Authors: Maria Pia Di Bonaventura, Ernest K Lee, Rob DeSalle, Paul J Planet
Institution:1. Division of Forest Insect Pest and Diseases, National Institute of Forest Science, Seoul 02455, Republic of Korea;2. Southern Forest Resources Research Center, National Institute of Forest Science, Jinju 52817, Republic of Korea;3. Department of Ophthalmic Optics, Shinhan University, Gyeonggi 11644, Republic of Korea;4. Division of Wood Chemistry & Microbiology, National Institute of Forest Science, Seoul 02455, Republic of Korea;1. Servicio Antimicrobianos, National Reference Laboratory, Instituto Nacional de Enfermedades Infecciosas-ANLIS “Dr. Carlos G. Malbrán”, Argentina;2. Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Argentina;3. Hospital General de Niños “Dr. Ricardo Gutierrez”, Buenos Aires, Argentina
Abstract: A phylogenomic approach was used to generate an amino acid phylogeny for 12 whole genomes representing 10 species in the family Pasteurellaceae. Orthology of genes was determined using an approach similar to OrthologID (http://nypg.bio.nyu.edu/orthologid/about.html), yielding a matrix of 3130 genes with 1,194,615 aligned amino acid characters, of which 239,504 are phylogenetically informative. Phylogenetic analysis of the concatenated matrix using all standard approaches (maximum parsimony, maximum likelihood, and Bayesian analysis) yields a single, extremely robust phylogenetic hypothesis for the species examined in this study. Remarkably, no single gene partition gives the same tree as the concatenated analysis. By analyzing partitioned support in the data matrix, we show that very little negative support emanates from individual gene partitions, so there is little to suggest that the concatenated hypothesis is untenable. The large number of characters in the matrix allows us to test hypotheses concerning missing data and character number in phylogenomic studies, and we conclude that matrices constructed using genome-level information are very robust to missing data. We show that a very large number of concatenated gene sequences (>160) is needed to reliably obtain the same topology as the overall analysis.
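As a rough illustration of the two quantities the abstract reports (a concatenated matrix and its count of phylogenetically informative characters), the sketch below concatenates per-gene alignments, pads taxa missing from a gene with `?`, and counts parsimony-informative columns. It is not the authors' pipeline: the function names, the toy alignments, and the choice to ignore `?`, `-`, and `X` are all assumptions made for this example. A column is treated as informative when at least two distinct states each occur in at least two taxa, which is the usual parsimony definition.

```python
from collections import Counter


def concatenate(partitions, taxa, missing="?"):
    """Join per-gene alignments into one matrix (hypothetical helper).

    Each partition maps taxon -> aligned sequence. A taxon absent from a
    gene is padded with the missing-data symbol for that gene's length.
    """
    matrix = {taxon: [] for taxon in taxa}
    for alignment in partitions:
        length = len(next(iter(alignment.values())))
        for taxon in taxa:
            matrix[taxon].append(alignment.get(taxon, missing * length))
    return {taxon: "".join(parts) for taxon, parts in matrix.items()}


def is_informative(column, ignore="?-X"):
    """True if at least two states each appear in at least two taxa."""
    counts = Counter(state for state in column if state not in ignore)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def count_informative(matrix):
    """Count parsimony-informative columns in a concatenated matrix."""
    sequences = list(matrix.values())
    return sum(is_informative(column) for column in zip(*sequences))


# Toy data (invented): two gene partitions, taxon D lacks the second gene.
taxa = ["A", "B", "C", "D"]
gene1 = {"A": "MKV", "B": "MKV", "C": "MRV", "D": "MRL"}
gene2 = {"A": "GA", "B": "GS", "C": "TS"}

supermatrix = concatenate([gene1, gene2], taxa)
informative = count_informative(supermatrix)
print(supermatrix["D"], informative)
```

In this toy matrix only the second column of the first gene (K, K, R, R) is informative. The padded `?` characters for taxon D add length to the matrix but no informative signal, which is the kind of missing data whose effect the abstract says was tested.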
This article is indexed in ScienceDirect and other databases.