Marker imputation efficiency for genotyping-by-sequencing data in rice (Oryza sativa) and alfalfa (Medicago sativa)
Authors: Nelson Nazzicari, Filippo Biscarini, Paolo Cozzi, E Charles Brummer, Paolo Annicchiarico
Institution: 1. Council for Agricultural Research and Economics (CREA), Research Centre for Fodder Crops and Dairy Productions, Lodi, Italy; 2. Dipartimento di Bioinformatica, Fondazione Parco Tecnologico Padano, Lodi, Italy; 3. Plant Sciences Department, University of California, Davis, USA
Abstract: Genotyping-by-sequencing (GBS) is a rapid and cost-effective genome-wide genotyping technique that can be applied whether or not a reference genome is available. Because of the trade-off between cost and coverage, however, GBS typically produces large amounts of missing marker genotypes, so imputing them is both challenging and critical for later analyses. In this work, the performance of four general imputation methods (K-nearest neighbors (KNNI), Random Forest (RFI), singular value decomposition, and mean value) and two genotype-specific methods (Beagle and FILLIN) was measured on GBS data from alfalfa (Medicago sativa L., autotetraploid, heterozygous, without reference genome) and rice (Oryza sativa L., diploid, 100 % homozygous, with reference genome). Alfalfa SNPs were aligned to the genome of the closely related species Medicago truncatula L. The benchmarks consisted of progressive data filtering for marker call rate (up to 70 %) and of masking increasing proportions (up to 20 %) of known genotypes for imputation. Relative performance was measured as the proportion of correctly imputed genotypes, both globally and within each genotype class (two homozygotes in rice; two homozygotes and one heterozygote in alfalfa). Imputation accuracy was robust to increasing missing rates and was consistently higher in rice than in alfalfa. Accuracy was as high as 90–100 % for the major (most frequent) homozygous genotype, but dropped to 80–90 % (rice) and below 30 % (alfalfa) for the minor homozygous genotype. In rice, Beagle was the best-performing method in terms of both accuracy and computing time. In alfalfa, KNNI and RFI gave the highest accuracies, but KNNI was much faster.
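The masking benchmark described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names are hypothetical, genotypes are assumed to be coded as allele dosages (0 and 2 for the two homozygotes, 1 for the heterozygote), missing calls are assumed to be `None`, and a simple per-marker most-frequent-genotype fill stands in for the imputation methods compared in the paper.

```python
import random
from collections import Counter

MISSING = None  # assumed marker for a missing genotype call


def mask_genotypes(matrix, proportion, seed=0):
    """Hide a proportion of the known calls.

    Returns a masked copy of the matrix and the list of hidden (row, col) positions.
    """
    rng = random.Random(seed)
    known = [(i, j) for i, row in enumerate(matrix)
             for j, g in enumerate(row) if g is not MISSING]
    hidden = rng.sample(known, round(proportion * len(known)))
    masked = [list(row) for row in matrix]
    for i, j in hidden:
        masked[i][j] = MISSING
    return masked, hidden


def impute_marker_mode(matrix):
    """Fill each missing call with the most frequent observed genotype of its marker.

    Rows are samples and columns are markers. A marker with no observed calls
    falls back to 0, which is an arbitrary choice made for this sketch.
    """
    filled = [list(row) for row in matrix]
    for j in range(len(matrix[0])):
        observed = [row[j] for row in matrix if row[j] is not MISSING]
        mode = Counter(observed).most_common(1)[0][0] if observed else 0
        for row in filled:
            if row[j] is MISSING:
                row[j] = mode
    return filled


def accuracy_by_class(truth, imputed, hidden):
    """Proportion of correctly imputed calls, overall ("all") and per true genotype class."""
    total, correct = Counter(), Counter()
    for i, j in hidden:
        g = truth[i][j]
        total[g] += 1
        total["all"] += 1
        if imputed[i][j] == g:
            correct[g] += 1
            correct["all"] += 1
    return {k: correct[k] / total[k] for k in total}


# Toy usage: 5 samples x 4 markers, 20 % of the known calls masked.
truth = [[0, 0, 2, 1], [0, 2, 2, 1], [0, 0, 2, 0], [0, 0, 0, 1], [2, 0, 2, 1]]
masked, hidden = mask_genotypes(truth, proportion=0.2, seed=1)
scores = accuracy_by_class(truth, impute_marker_mode(masked), hidden)
```

Reporting accuracy per genotype class, and not only globally, matters because a fill that always predicts the most frequent genotype can score well overall while rarely recovering the minor homozygote.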
This article is indexed in SpringerLink and other databases.