DAIRRy-BLUP: A High-Performance Computing Approach to Genomic Prediction

Authors:	Arne De Coninck, Jan Fostier, Steven Maenhout, Bernard De Baets

Institution:	* Research Unit Knowledge-based Systems KERMIT, Department of Mathematical Modelling, Statistics and Bioinformatics, Ghent University, B-9000 Ghent, Belgium; † IBCN, Internet Based Communication Networks and Services Research Unit, Department of Information Technology, Ghent University–iMinds, B-9000 Ghent, Belgium; ‡ Progeno, B-9052 Zwijnaarde, Belgium

Abstract:	In genomic prediction, common analysis methods rely on a linear mixed-model framework to estimate SNP marker effects and breeding values of animals or plants. Ridge regression–best linear unbiased prediction (RR-BLUP) is based on the assumptions that SNP marker effects are normally distributed, are uncorrelated, and have equal variances. We propose DAIRRy-BLUP, a parallel, Distributed-memory RR-BLUP implementation, based on single-trait observations (y), that uses the Average Information algorithm for restricted maximum-likelihood estimation of the variance components. The goal of DAIRRy-BLUP is to enable the analysis of large-scale data sets to provide more accurate estimates of marker effects and breeding values. A distributed-memory framework is required since the dimensionality of the problem, determined by the number of SNP markers, can become too large to be analyzed by a single computing node. Initial results show that DAIRRy-BLUP enables the analysis of very large-scale data sets (up to 1,000,000 individuals and 360,000 SNPs) and indicate that increasing the number of phenotypic and genotypic records has a more significant effect on the prediction accuracy than increasing the density of SNP arrays.
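
The RR-BLUP model named in the abstract can be illustrated on a single machine with a minimal sketch. This is not the DAIRRy-BLUP implementation: it uses dense NumPy arrays rather than distributed memory, and it assumes the variance ratio `lam` (residual variance divided by marker-effect variance) is already known, whereas the abstract describes estimating the variance components with the Average Information REML algorithm. The function name `rr_blup`, the simulated genotypes, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def rr_blup(y, X, Z, lam):
    """Solve the mixed-model equations for y = X b + Z u + e.

    Assumes u ~ N(0, sigma_u^2 I) and e ~ N(0, sigma_e^2 I), i.e. marker
    effects that are normal, uncorrelated, and of equal variance.
    lam = sigma_e^2 / sigma_u^2 is taken as known (an assumption of this sketch).
    """
    n_fixed = X.shape[1]
    n_markers = Z.shape[1]
    # Coefficient matrix of the mixed-model equations.
    C = np.block([
        [X.T @ X, X.T @ Z],
        [Z.T @ X, Z.T @ Z + lam * np.eye(n_markers)],
    ])
    rhs = np.concatenate([X.T @ y, Z.T @ y])
    sol = np.linalg.solve(C, rhs)
    return sol[:n_fixed], sol[n_fixed:]

# Small simulated data set (hypothetical sizes and variances).
rng = np.random.default_rng(0)
n_individuals, n_snps = 200, 50
Z = rng.integers(0, 3, size=(n_individuals, n_snps)).astype(float)  # genotypes coded 0/1/2
X = np.ones((n_individuals, 1))                                     # overall mean only
u_true = rng.normal(0.0, 0.1, n_snps)
y = 2.0 + Z @ u_true + rng.normal(0.0, 1.0, n_individuals)

lam = 1.0 / 0.01  # sigma_e^2 / sigma_u^2 for the simulated variances
b_hat, u_hat = rr_blup(y, X, Z, lam)
gebv = Z @ u_hat  # marker-based breeding value estimates
```

The coefficient matrix has one row and column per fixed effect plus one per SNP marker, which is why, as the abstract notes, the number of markers determines the dimensionality of the problem and can outgrow the memory of a single node.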

Keywords:	genomic prediction; high-performance computing; distributed-memory architecture; variance component estimation; simulated data
