Bayesian Inference of Genetic Parameters Based on Conditional Decompositions of Multivariate Normal Distributions
Authors: Jon Hallander, Patrik Waldmann, Chunkao Wang, Mikko J. Sillanpää
Abstract: It is widely recognized that the mixed linear model is an important tool for parameter estimation in the analysis of complex pedigrees, which includes both pedigree and genomic information, and where mutually dependent genetic factors are often assumed to follow multivariate normal distributions of high dimension. We have developed a Bayesian statistical method based on the decomposition of the multivariate normal prior distribution into products of conditional univariate distributions. This procedure permits computationally demanding genetic evaluations of complex pedigrees, within the user-friendly computer package WinBUGS. To demonstrate and evaluate the flexibility of the method, we analyzed two example pedigrees: a large noninbred pedigree of Scots pine (Pinus sylvestris L.) that includes additive and dominance polygenic relationships and a simulated pedigree where genomic relationships have been calculated on the basis of a dense marker map. The analysis showed that our method was fast and provided accurate estimates and that it should therefore be a helpful tool for estimating genetic parameters of complex pedigrees quickly and reliably.

Much effort in genetics has been devoted to revealing the underlying genetic architecture of quantitative or complex traits. Traditionally, the polygenic model has been used extensively to estimate genetic variances and breeding values of natural and breeding populations, where an infinite number of genes is assumed to code for the trait of interest (Bulmer 1971; Falconer and Mackay 1996). The genetic variance of a quantitative trait can be decomposed into an additive part that corresponds to the effects of individual alleles and a part that is nonadditive because of interactions between alleles.
Attention has generally been focused on the estimation of additive genetic variance (and heritability), since additive variation is directly proportional to the response to selection via the breeder's equation (Falconer and Mackay 1996, Chap. 11). However, to estimate additive genetic variation and heritability accurately, it can be important to identify potential nonadditive sources in genetic evaluations (Misztal 1997; Ovaskainen et al. 2008; Waldmann et al. 2008), especially if the pedigree being analyzed contains a large proportion of full-sibs and clones, as these in particular give rise to nonadditive genetic relationships (Lynch and Walsh 1998, p. 145). The polygenic model using pedigree and phenotypic information, i.e., the animal model (Henderson 1984), has been the model of choice for estimating genetic parameters in breeding and natural populations (Abney et al. 2000; Sorensen and Gianola 2002; O'Hara et al. 2008).

Recent breakthroughs in molecular techniques have made it possible to create genome-wide, single nucleotide polymorphism (SNP) maps. These maps have helped to uncover a vast number of new loci responsible for trait expression and have provided general insights into the genetic architecture of quantitative traits (e.g., Valdar et al. 2006; Visscher 2008; Flint and Mackay 2009). These insights can help when calculating disease risks in humans, when attempting to increase the yield from breeding programs, and when estimating relatedness in conservation programs. High-density SNPs of many species of importance to science and agriculture can now be scored quickly and relatively cheaply, for example, in mice (Valdar et al. 2006), chickens (Muir et al. 2008), and dairy cattle (VanRaden et al. 2009).

In the analysis of populations of breeding stock, the inclusion of dense marker data has improved the predictive ability (i.e., reliability) of genetic evaluations compared to the traditional phenotype model, both in simulations (Meuwissen et al. 2001; Calus et al. 2008; Hayes et al. 2009) and when using real data (Legarra et al. 2008; VanRaden et al. 2009; González-Recio et al. 2009). Meuwissen et al. (2001) suggested that the effect of all markers should first be estimated, and then summed, to obtain genomic estimated breeding values (GEBVs). An alternative procedure, where all markers are used to compute the genomic relationship matrix (in place of the additive polygenic relationship matrix), has also been suggested (e.g., Villanueva et al. 2005; VanRaden 2008; Hayes et al. 2009); this matrix is then incorporated into the statistical analysis to estimate GEBVs. A comparison of the two procedures (VanRaden 2008) yielded similar estimates of GEBVs in cases where the effect of an individual allele was small. In addition, if not all pedigree members have marker information, a combined relationship matrix derived from both genotyped and ungenotyped individuals can be computed; this has been shown to increase the accuracy of GEBVs (Legarra et al. 2009; Misztal et al. 2009). Another plausible option for incorporating marker information is to use low-density SNP panels within families and to trace the effect of SNPs from high-density genotyped ancestors, as suggested by Habier et al. (2009) and Weigel et al. (2009). However, fast and powerful computer algorithms, which can use the marker information as efficiently as possible in the analysis of quantitative traits, are needed to obtain accurate GEBVs from genome-wide marker data.

This study describes the development of an efficient Bayesian method for incorporating general relationships into the genetic evaluation procedure. The method is based on expressing the multivariate normal prior distribution as a product of one-dimensional normal distributions, each conditioned on the descending variables.
When evaluating the genetic parameters of natural and breeding populations, high-dimensional distributions are often used as prior distributions of various genetic effects, such as the additive polygenic effect (Wang et al. 1993), multivariate additive polygenic effects (Van Tassell and Van Vleck 1996), and quantitative trait loci (QTL) effects via the identical-by-descent matrix (Yi and Xu 2000). A Bayesian framework is adopted to obtain posterior distributions of all unknown parameters, estimated by using Markov chain Monte Carlo (MCMC) sampling algorithms in the software package WinBUGS (Lunn et al. 2000, 2009). By performing prior calculations in the form of the factorized product of simple univariate conditional distributions, the computational time of the MCMC estimation procedure is reduced considerably. This feature permits rapid inference for both the polygenic model and the genomic relationship model. Moreover, the decomposition allows for inbreeding of varying degree, since the correct genetic covariance structure can be incorporated into the analysis. In this article, we test the method on two previously published pedigree data sets: phenotype data from a large pedigree of Scots pine, incorporating information on both additive and dominance genetic relationships (Waldmann et al. 2008); and genomic information obtained from a genome-wide scan of a simulated animal population (Lund et al. 2009).
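
The factorization of a multivariate normal prior into univariate conditionals can be illustrated with a minimal Python sketch. It is not the authors' WinBUGS implementation. It assumes a zero-mean prior u ~ N(0, sigma2 * A), conditions each element on the elements that precede it in the ordering (the ordering used here is an assumption), and uses a toy relationship matrix for two unrelated parents and their two full-sib offspring. The function names, the matrix A, and the values of sigma2 and u are all hypothetical and chosen only for illustration.

```python
import numpy as np


def conditional_decomposition(cov):
    """For u ~ N(0, cov), return (weights, cond_vars) such that
    u[i] | u[:i] ~ N(weights[i, :i] @ u[:i], cond_vars[i])."""
    n = cov.shape[0]
    weights = np.zeros((n, n))  # strictly lower triangular
    cond_vars = np.zeros(n)
    cond_vars[0] = cov[0, 0]    # first element has no predecessors
    for i in range(1, n):
        # Regression coefficients of u[i] on its predecessors u[:i].
        b = np.linalg.solve(cov[:i, :i], cov[:i, i])
        weights[i, :i] = b
        # Conditional variance: marginal variance minus the explained part.
        cond_vars[i] = cov[i, i] - cov[:i, i] @ b
    return weights, cond_vars


def log_density_factorized(u, weights, cond_vars):
    """Sum of univariate normal log densities, one per conditional."""
    means = weights @ u         # means[i] depends on u[:i] only
    resid = u - means
    return float(np.sum(-0.5 * (np.log(2 * np.pi * cond_vars)
                                + resid ** 2 / cond_vars)))


def log_density_joint(u, cov):
    """Log density of N(0, cov) evaluated directly, for comparison."""
    n = len(u)
    _, logdet = np.linalg.slogdet(cov)
    quad = u @ np.linalg.solve(cov, u)
    return float(-0.5 * (n * np.log(2 * np.pi) + logdet + quad))


# Toy additive relationship matrix (assumed): individuals 0 and 1 are
# unrelated, noninbred parents; individuals 2 and 3 are their full-sib offspring.
A = np.array([
    [1.0, 0.0, 0.5, 0.5],
    [0.0, 1.0, 0.5, 0.5],
    [0.5, 0.5, 1.0, 0.5],
    [0.5, 0.5, 0.5, 1.0],
])
sigma2 = 2.0                                # assumed additive genetic variance
u = np.array([0.3, -1.2, 0.5, 0.8])         # arbitrary vector of genetic effects

weights, cond_vars = conditional_decomposition(sigma2 * A)
factorized = log_density_factorized(u, weights, cond_vars)
joint = log_density_joint(u, sigma2 * A)
```

In this toy pedigree each offspring's conditional mean is the average of its two parents' effects and its conditional variance is half the additive variance, and the factorized and joint log densities agree. The sketch works from a general covariance matrix rather than from pedigree rules, so the same functions would apply to a positive-definite genomic or dominance relationship matrix.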