Similar literature
 20 similar articles found (search time: 595 ms)
1.
In Chile, an intensive Eucalyptus globulus clonal selection program is being carried out to increase forest productivity for pulp production. A breeding population was used to investigate the predictive ability of single nucleotide polymorphism (SNP) markers for genomic selection (GS). A total of 310 clones from 53 families were used. Stem volume and wood density were measured on all clones. Trees were genotyped at 12 K polymorphic markers using the EUChip60K genotype array. Genomic best linear unbiased prediction, Bayesian lasso regression, Bayes B, and Bayes C models were used to predict genomic estimated breeding values (GEBV). For cross-validation, 260 individuals were sampled for model training and 50 individuals for model validation, using 2 folds and 10 replications each. The average predictive ability estimates for wood density and stem volume across the models were 0.58 and 0.75, respectively. The average rank correlations were 0.59 and 0.71, respectively. Models produced very similar bias for both traits. When clones were ranked based on their GEBV, models had similar phenotypic means for the top 10% of the clones. The predictive ability of markers will likely decrease if the models are used to predict GEBV of new material coming from the breeding program, because of a different marker–trait phase introduced by recombination. The results should be validated with larger populations and across two generations before routine applications of GS in E. globulus. We suggest that GS is a viable strategy to accelerate the clonal selection program of E. globulus in Chile.

2.
Genomic selection (GS) has been implemented in animal and plant species, and is regarded as a useful tool for accelerating genetic gains. Varying levels of genomic prediction accuracy have been obtained in plants, depending on the prediction problem assessed and on several other factors, such as trait heritability, the relationship between the individuals to be predicted and those used to train the models for prediction, number of markers, sample size and genotype × environment interaction (GE). The main objective of this article is to describe the results of genomic prediction in the International Maize and Wheat Improvement Center's (CIMMYT's) maize and wheat breeding programs, from the initial assessment of the predictive ability of different models using pedigree and marker information to the present, when methods for implementing GS in practical global maize and wheat breeding programs are being studied and investigated. Results show that pedigree (population structure) accounts for a sizeable proportion of the prediction accuracy when a global population is the prediction problem to be assessed. However, when the prediction uses unrelated populations to train the prediction equations, prediction accuracy becomes negligible. When genomic prediction includes modeling GE, an increase in prediction accuracy can be achieved by borrowing information from correlated environments. Several questions on how to incorporate GS into CIMMYT's maize and wheat programs remain unanswered and subject to further investigation, for example, prediction within and between related bi-parental crosses. Further research on the quantification of breeding value components for GS in plant breeding populations is required.

3.
The ability to predict quantitative trait phenotypes from molecular polymorphism data will revolutionize evolutionary biology, medicine and human biology, and animal and plant breeding. Efforts to map quantitative trait loci have yielded novel insights into the biology of quantitative traits, but the combination of individually significant quantitative trait loci typically has low predictive ability. Utilizing all segregating variants can give good predictive ability in plant and animal breeding populations, but gives little insight into trait biology. Here, we used the Drosophila Genetic Reference Panel to perform both a genome wide association analysis and genomic prediction for the fitness-related trait chill coma recovery time. We found substantial total genetic variation for chill coma recovery time, with a genetic architecture that differs between males and females, a small number of molecular variants with large main effects, and evidence for epistasis. Although the top additive variants explained 36% (17%) of the genetic variance among lines in females (males), the predictive ability using genomic best linear unbiased prediction and a relationship matrix using all common segregating variants was very low for females and zero for males. We hypothesized that the low predictive ability was due to the mismatch between the infinitesimal genetic architecture assumed by the genomic best linear unbiased prediction model and the true genetic architecture of chill coma recovery time. Indeed, we found that the predictive ability of the genomic best linear unbiased prediction model is markedly improved when we combine quantitative trait locus mapping with genomic prediction by only including the top variants associated with main and epistatic effects in the relationship matrix. This trait-associated prediction approach has the advantage that it yields biologically interpretable prediction models.  相似文献   

4.
Genomic Selection (GS) is a new breeding method in which genome-wide markers are used to predict the breeding value of individuals in a breeding population. GS has been shown to improve breeding efficiency in dairy cattle and several crop plant species, and here we evaluate for the first time its efficacy for breeding inbred lines of rice. We performed a genome-wide association study (GWAS) in conjunction with five-fold GS cross-validation on a population of 363 elite breeding lines from the International Rice Research Institute's (IRRI) irrigated rice breeding program and herein report the GS results. The population was genotyped with 73,147 markers using genotyping-by-sequencing. The training population, statistical method used to build the GS model, number of markers, and trait were varied to determine their effect on prediction accuracy. For all three traits, genomic prediction models outperformed prediction based on pedigree records alone. Prediction accuracies ranged from 0.31 and 0.34 for grain yield and plant height to 0.63 for flowering time. Analyses using subsets of the full marker set suggest that using one marker every 0.2 cM is sufficient for genomic selection in this collection of rice breeding materials. RR-BLUP was the best performing statistical method for grain yield where no large effect QTL were detected by GWAS, while for flowering time, where a single very large effect QTL was detected, the non-GS multiple linear regression method outperformed GS models. For plant height, in which four mid-sized QTL were identified by GWAS, random forest produced the most consistently accurate GS models. Our results suggest that GS, informed by GWAS interpretations of genetic architecture and population structure, could become an effective tool for increasing the efficiency of rice breeding as the costs of genotyping continue to decline.

5.
Zhu C  Zhang R 《Heredity》2007,98(6):401-410
The triple test cross (TTC) is an experimental design for detecting epistasis and estimating the components of genetic variance for quantitative traits. In this paper, we extend the analysis to include molecular information. The statistical power of the mating design was assessed under a model assuming that a finite number of loci affect the trait in question. Formulae are developed for the analysis with or without marker information relating to the recombination fraction between loci, the genetical properties of quantitative trait controlled by the quantitative trait loci (QTL), the linkage phases of the parents and population size. Application of these formulae showed that the recombination fraction between genes and the magnitude and the types of epistasis have important interactions in their effects on power. The results demonstrate that the TTC may have increased power to detect epistasis when marker information is present. However, the simulation experiments show that the standard deviation of the estimated expected mean square was higher with one marker than that with two, whereas the corresponding value without marker information was the lowest. In addition, we demonstrate that the relative position of QTL and markers and the number of markers can both affect the power of epistatic detection.  相似文献   

6.
Genomic selection can increase genetic gain per generation through early selection. Genomic selection is expected to be particularly valuable for traits that are costly to phenotype and expressed late in the life cycle of long-lived species. Alternative approaches to genomic selection prediction models may perform differently for traits with distinct genetic properties. Here the performance of four different original methods of genomic selection that differ with respect to assumptions regarding distribution of marker effects, including (i) ridge regression-best linear unbiased prediction (RR-BLUP), (ii) Bayes A, (iii) Bayes Cπ, and (iv) Bayesian LASSO are presented. In addition, a modified RR-BLUP (RR-BLUP B) that utilizes a selected subset of markers was evaluated. The accuracy of these methods was compared across 17 traits with distinct heritabilities and genetic architectures, including growth, development, and disease-resistance properties, measured in a Pinus taeda (loblolly pine) training population of 951 individuals genotyped with 4853 SNPs. The predictive ability of the methods was evaluated using a 10-fold, cross-validation approach, and differed only marginally for most method/trait combinations. Interestingly, for fusiform rust disease-resistance traits, Bayes Cπ, Bayes A, and RR-BLUP B had higher predictive ability than RR-BLUP and Bayesian LASSO. Fusiform rust is controlled by few genes of large effect. A limitation of RR-BLUP is the assumption of equal contribution of all markers to the observed variation. However, RR-BLUP B performed equally well as the Bayesian approaches. The genotypic and phenotypic data used in this study are publicly available for comparative analysis of genomic selection prediction models.
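
The evaluation scheme in entry 6 (RR-BLUP scored by 10-fold cross-validation) can be illustrated with a minimal sketch. It is not the authors' code: RR-BLUP is approximated here by plain ridge regression with a fixed shrinkage parameter `lam`, the data are simulated, and all function names are hypothetical.

```python
import numpy as np

def rr_blup_fit(X, y, lam):
    """Ridge estimate of marker effects; every marker gets the same shrinkage lam."""
    mu = y.mean()
    x_mean = X.mean(axis=0)
    Xc = X - x_mean
    n_markers = X.shape[1]
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(n_markers), Xc.T @ (y - mu))
    return mu, x_mean, beta

def rr_blup_predict(model, X):
    """Predicted genetic values for new genotypes."""
    mu, x_mean, beta = model
    return mu + (X - x_mean) @ beta

def cv_predictive_ability(X, y, lam, k=10, seed=0):
    """Mean correlation between predicted and observed phenotypes over k folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    cors = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        model = rr_blup_fit(X[train], y[train], lam)
        pred = rr_blup_predict(model, X[fold])
        cors.append(np.corrcoef(pred, y[fold])[0, 1])
    return float(np.mean(cors))

def simulate_population(n=200, p=50, seed=1):
    """Hypothetical toy data: genotypes coded 0/1/2, purely additive effects."""
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 3, size=(n, p)).astype(float)
    effects = rng.normal(size=p)
    y = X @ effects + rng.normal(size=n)
    return X, y
```

Calling `cv_predictive_ability(*simulate_population(), lam=1.0)` returns the average fold-wise correlation. In practice `lam` would be estimated from variance components rather than fixed by hand.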

7.
Crop improvement is a long-term, expensive institutional endeavor. Genomic selection (GS), which uses single nucleotide polymorphism (SNP) information to estimate genomic breeding values, has proven efficient at increasing genetic gain by accelerating the breeding process in animal breeding programs. As for crop improvement, with few exceptions, GS applicability remains in the evaluation of algorithm performance. In this study, we examined factors related to GS applicability in the line development stage for grain yield using a hard red winter wheat (Triticum aestivum L.) doubled-haploid population. The performance of GS was evaluated in two consecutive years to predict grain yield. In general, the semi-parametric reproducing kernel Hilbert space prediction algorithm outperformed parametric genomic best linear unbiased prediction. For both parametric and semi-parametric algorithms, an upward bias in predictability was apparent in within-year cross-validation, suggesting the prerequisite of cross-year validation for a more reliable prediction. Adjusting the training population’s phenotype for genotype by environment effect had a positive impact on the GS model’s predictive ability. Possibly due to marker redundancy, a selected subset of SNPs at an absolute pairwise correlation coefficient threshold value of 0.4 produced comparable results and reduced the computational burden of considering the full SNP set. Finally, in the context of an ongoing breeding and selection effort, the present study has provided a measure of confidence based on the deviation of line selection from GS results, supporting the implementation of GS in wheat variety development.
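
The marker-subset step in entry 7 (keeping SNPs below an absolute pairwise correlation of 0.4) can be sketched as a greedy filter. The abstract does not say how the subset was chosen, so the greedy left-to-right order and the function name below are assumptions for illustration only.

```python
import numpy as np

def prune_by_correlation(X, threshold=0.4):
    """Return column indices of markers kept by a greedy correlation filter.

    A marker is kept only if its absolute correlation with every marker
    already kept is at most `threshold`. Monomorphic markers are dropped
    first because their correlation is undefined.
    """
    X = np.asarray(X, dtype=float)
    polymorphic = np.flatnonzero(X.std(axis=0) > 0)
    if polymorphic.size == 0:
        return []
    r = np.abs(np.corrcoef(X[:, polymorphic], rowvar=False))
    kept = []  # positions within `polymorphic`
    for j in range(len(polymorphic)):
        if all(r[j, k] <= threshold for k in kept):
            kept.append(j)
    return polymorphic[kept].tolist()
```

For a genotype matrix in which the second column duplicates the first, only one of the pair survives. The result depends on marker order, which is a known property of greedy pruning.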

8.
Due to their long reproductive cycles and the time to expression of mature traits, marker-assisted selection is particularly attractive for tree breeding. In this review, we discuss different approaches used for developing markers and propose a method for application of markers in low linkage disequilibrium (LD) populations. Identification of useful markers for application in tree breeding is mainly based on two approaches, quantitative trait locus (QTL) mapping and association genetic studies. While several studies have identified significant markers, the effect of individual markers is low, making it difficult to utilize them in breeding programs. Recently, genomic selection (GS) was proposed for overcoming some of these difficulties. In GS, high density markers are used for predicting phenotypes from genotypes. Currently small effective populations with high LD are being tested for GS in tree breeding. For wider application, GS needs to be applied in low LD populations which are found in many tree breeding programs. Here we propose an approach in which the significant markers from association studies may be used for developing prediction models in low LD populations using the same methods as in GS. Preliminary analyses indicate that a modest number of markers may be sufficient for developing prediction models in low LD populations. GS based on large numbers of random markers or small numbers of associated markers is poised to make marker-assisted selection a reality in forest tree breeding.

9.
• Genomic selection (GS) is expected to cause a paradigm shift in tree breeding by improving its speed and efficiency. By fitting all the genome-wide markers concurrently, GS can capture most of the 'missing heritability' of complex traits that quantitative trait locus (QTL) and association mapping classically fail to explain. Experimental support of GS is now required. • The effectiveness of GS was assessed in two unrelated Eucalyptus breeding populations with contrasting effective population sizes (N(e) = 11 and 51) genotyped with > 3000 DArT markers. Prediction models were developed for tree circumference and height growth, wood specific gravity and pulp yield using random regression best linear unbiased predictor (BLUP). • Accuracies of GS varied between 0.55 and 0.88, matching the accuracies achieved by conventional phenotypic selection. Substantial proportions (74-97%) of trait heritability were captured by fitting all genome-wide markers simultaneously. Genomic regions explaining trait variation largely coincided between populations, although GS models predicted poorly across populations, likely as a result of variable patterns of linkage disequilibrium, inconsistent allelic effects and genotype × environment interaction. • GS brings a new perspective to the understanding of quantitative trait variation in forest trees and provides a revolutionary tool for applied tree improvement. Nevertheless population-specific predictive models will likely drive the initial applications of GS in forest tree breeding.

10.
Gardner Syndrome (GS) is an autosomal dominant variant of colorectal polyposis with essentially complete penetrance. It is distinguished from the other polyposis syndromes by its delayed age at onset, the number of polyps, and its extracolonic manifestations. The presence of epidermal cysts, bony osteomata, desmoid tumors, and dental anomalies are distinguishing features of this syndrome. Recently, multiple and bilateral patches of congenital hypertrophy of the retinal pigment epithelium (CHRPE) have been described in three families with classical GS. Tight linkage of the GS and CHRPE phenotypes (Z = 9.752; theta = 0) suggested that CHRPE is a pleiotropic effect of the Gardner mutation within the families in which the ophthalmic trait occurs and is thus a useful marker for the early detection of GS gene carriers. We have analyzed six new families segregating for classic GS and CHRPE. Linkage was tested between GS and CHRPE and between these two phenotypes and a battery of 22 informative biochemical and serological markers. We have extended the linkage analysis on two GS-CHRPE families originally reported elsewhere. Linkage between GS and CHRPE at theta = 0 was observed in all families, a result supporting our original suggestion that CHRPE is a congenital manifestation of the GS mutation. Exclusionary linkage data presented confirm that, for linkage analysis in these families, the CHRPE phenotype is a more powerful marker than other phenotypic features of GS.  相似文献   

11.
Selective breeding is a common and effective approach for genetic improvement of aquaculture stocks with parental selection as the key factor. Genomic selection (GS) has been proposed as a promising tool to facilitate selective breeding. Here, we evaluated the predictability of four GS methods in Zhikong scallop (Chlamys farreri) through real dataset analyses of four economic traits (shell length, shell height, shell width, and whole weight). Our analysis revealed that different GS models exhibited variable performance in prediction accuracy depending on genetic and statistical factors, but non-parametric methods, including reproducing kernel Hilbert spaces regression (RKHS) and sparse neural networks (SNN), generally outperformed parametric linear methods, such as genomic best linear unbiased prediction (GBLUP) and BayesB. Furthermore, we demonstrated that the predictability relied mainly on the heritability regardless of GS methods. The size of training population and marker density also had considerable effects on the predictive performance. In practice, increasing the training population size could better improve the genomic prediction than raising the marker density. This study is the first to apply non-linear models and neural networks for GS in scallop and should be valuable to help develop strategies for aquaculture breeding programs.

12.
13.
14.

Key message

We compare genomic selection methods that use correlated traits to help predict biomass yield in sorghum, and find that trait-assisted genomic selection performs best.

Abstract

Genomic selection (GS) is usually performed on a single trait, but correlated traits can also help predict a focal trait through indirect or multi-trait GS. In this study, we use a pre-breeding population of biomass sorghum to compare strategies that use correlated traits to improve prediction of biomass yield, the focal trait. Correlated traits include moisture, plant height measured at monthly intervals between planting and harvest, and the area under the growth progress curve. In addition to single- and multi-trait direct and indirect GS, we test a new strategy called trait-assisted GS, in which correlated traits are used along with marker data in the validation population to predict a focal trait. Single-trait GS for biomass yield had a prediction accuracy of 0.40. Indirect GS performed best using area under the growth progress curve to predict biomass yield, with a prediction accuracy of 0.37, and did not differ from indirect multi-trait GS that also used moisture information. Multi-trait GS and single-trait GS yielded similar results, indicating that correlated traits did not improve prediction of biomass yield in a standard GS scenario. However, trait-assisted GS increased prediction accuracy by up to 50% when using plant height in both the training and validation populations to help predict yield in the validation population. Coincidence between selected genotypes in phenotypic and genomic selection was also highest in trait-assisted GS. Overall, these results suggest that trait-assisted GS can be an efficient strategy when correlated traits are obtained earlier or more inexpensively than a focal trait.

15.
We compared the accuracies of four genomic-selection prediction methods as affected by marker density, level of linkage disequilibrium (LD), quantitative trait locus (QTL) number, sample size, and level of replication in populations generated from multiple inbred lines. Marker data on 42 two-row spring barley inbred lines were used to simulate high and low LD populations from multiple inbred line crosses: the first included many small full-sib families and the second was derived from five generations of random mating. True breeding values (TBV) were simulated on the basis of 20 or 80 additive QTL. Methods used to derive genomic estimated breeding values (GEBV) were random regression best linear unbiased prediction (RR–BLUP), Bayes-B, a Bayesian shrinkage regression method, and BLUP from a mixed model analysis using a relationship matrix calculated from marker data. Using the best methods, accuracies of GEBV were comparable to accuracies from phenotype for predicting TBV without requiring the time and expense of field evaluation. We identified a trade-off between a method's ability to capture marker-QTL LD vs. marker-based relatedness of individuals. The Bayesian shrinkage regression method primarily captured LD, the BLUP methods captured relationships, while Bayes-B captured both. Under most of the study scenarios, mixed-model analysis using a marker-derived relationship matrix (BLUP) was more accurate than methods that directly estimated marker effects, suggesting that relationship information was more valuable than LD information. When markers were in strong LD with large-effect QTL, or when predictions were made on individuals several generations removed from the training data set, however, the ranking of method performance was reversed and BLUP had the lowest accuracy.  相似文献   
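
Entry 15 contrasts methods that estimate marker effects with BLUP on a marker-derived relationship matrix. A minimal sketch of the latter follows. The abstract does not state which relationship-matrix formula was used, so the common scaling G = ZZ'/(2 Σ p(1 − p)) is assumed here, and the shrinkage parameter `lam` and both function names are hypothetical.

```python
import numpy as np

def genomic_relationship_matrix(M):
    """Marker-based relationship matrix from genotypes coded 0/1/2.

    Assumed formula: G = Z Z' / (2 * sum(p * (1 - p))), where Z is the
    genotype matrix centred by twice the allele frequency p of each marker.
    Requires at least one polymorphic marker.
    """
    M = np.asarray(M, dtype=float)
    p = M.mean(axis=0) / 2.0
    Z = M - 2.0 * p
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def gblup_predict(G, y_train, train, test, lam):
    """Predict test individuals from relationships with training individuals.

    lam plays the role of the residual-to-genetic variance ratio.
    """
    mu = y_train.mean()
    G_tt = G[np.ix_(train, train)]
    G_vt = G[np.ix_(test, train)]
    weights = np.linalg.solve(G_tt + lam * np.eye(len(train)), y_train - mu)
    return mu + G_vt @ weights
```

Because predictions are driven by the entries of G between test and training individuals, this approach uses relatedness directly, which is the property the abstract attributes to the BLUP methods.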

16.
It has long been recognized that epistasis or interactions between non-allelic genes plays an important role in the genetic control and evolution of quantitative traits. However, the detection of epistasis and estimation of epistatic effects are difficult due to the complexity of epistatic patterns, insufficient sample size of mapping populations and lack of efficient statistical methods. Under the assumption of additivity of QTL effects on the phenotype of a trait in interest, the additive effect of a QTL can be completely absorbed by the flanking marker variables, and the epistatic effect between two QTL can be completely absorbed by the four marker-pair multiplication variables between the two pairs of flanking markers. Based on this property, we proposed an inclusive composite interval mapping (ICIM) by simultaneously considering marker variables and marker-pair multiplications in a linear model. Stepwise regression was applied to identify the most significant markers and marker-pair multiplications. Then a two-dimensional scanning (or interval mapping) was conducted to identify QTL with significant digenic epistasis using adjusted phenotypic values based on the best multiple regression model. The adjusted values retain the information of QTL on the two current mapping intervals but exclude the influence of QTL on other intervals and chromosomes. Epistatic QTL can be identified by ICIM, no matter whether the two interacting QTL have any additive effects. Simulated populations and one barley doubled haploids (DH) population were used to demonstrate the efficiency of ICIM in mapping both additive QTL and digenic interactions.  相似文献   

17.
Yu R  Shete S 《BMC genetics》2005,6(Z1):S136
A supervised learning method, support vector machine, was used to analyze the microsatellite marker dataset of the Collaborative Study on the Genetics of Alcoholism Problem 1 for the Genetic Analysis Workshop 14. Twelve binary-valued phenotype variables were chosen for analyses using the markers from all autosomal chromosomes. Using various polynomial kernel functions of the support vector machine and randomly divided genome regions, we were able to observe the association of some marker sets with the chosen phenotypes and thus reduce the size of the dataset. The successful classifications established with the chosen support vector machine kernel function had high levels of correctness for each prediction, e.g., 96% in the fourfold cross-validations. However, owing to the limited sample data, we were not able to test the predictions of the classifiers in the new sample data.  相似文献   

18.
Long N  Gianola D  Rosa GJ  Weigel KA 《Genetica》2011,139(7):843-854
It has become increasingly clear from systems biology arguments that interaction and non-linearity play an important role in genetic regulation of phenotypic variation for complex traits. Marker-assisted prediction of genetic values assuming additive gene action has been widely investigated because of its relevance in artificial selection. On the other hand, it has been less well-studied when non-additive effects hold. Here, we explored a nonparametric model, radial basis function (RBF) regression, for predicting quantitative traits under different gene action modes (additivity, dominance and epistasis). Using simulation, it was found that RBF had better ability (higher predictive correlations and lower predictive mean square errors) of predicting merit of individuals in future generations in the presence of non-additive effects than a linear additive model, the Bayesian Lasso. This was true for populations undergoing either directional or random selection over several generations. Under additive gene action, RBF was slightly worse than the Bayesian Lasso. While prediction of genetic values under additive gene action is well handled by a variety of parametric models, nonparametric RBF regression is a useful counterpart for dealing with situations where non-additive gene action is suspected, and it is robust irrespective of mode of gene action.  相似文献   
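
The idea behind entry 18, that a radial basis function model can capture non-additive gene action that a linear additive model misses, can be illustrated with a small sketch. The abstract does not give the exact RBF formulation, so Gaussian-kernel ridge regression is used here as a stand-in; the bandwidth `theta`, the penalty `lam`, and all function names are assumptions.

```python
import numpy as np

def gaussian_kernel(A, B, theta):
    """exp(-theta * squared Euclidean distance) between rows of A and B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-theta * np.maximum(sq, 0.0))

def rbf_fit(X, y, theta, lam):
    """Solve (K + lam * I) alpha = y for the kernel weights alpha."""
    K = gaussian_kernel(X, X, theta)
    return np.linalg.solve(K + lam * np.eye(len(y)), y)

def rbf_predict(X_train, alpha, X_new, theta):
    """Prediction is a weighted sum of kernels centred on training genotypes."""
    return gaussian_kernel(X_new, X_train, theta) @ alpha

# Toy two-locus example with pure epistasis: the phenotype is 1 only when
# the two loci carry different homozygous genotypes.
X_toy = np.array([[0.0, 0.0], [0.0, 2.0], [2.0, 0.0], [2.0, 2.0]])
y_toy = np.array([0.0, 1.0, 1.0, 0.0])
```

With the toy data above, each locus taken alone carries no additive signal, yet the kernel model reproduces the four phenotypes when `lam` is close to zero.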

19.
Genomic selection (GS) is a method for predicting breeding values of plants or animals using many molecular markers that is commonly implemented in two stages. In plant breeding the first stage usually involves computation of adjusted means for genotypes which are then used to predict genomic breeding values in the second stage. We compared two classical stage-wise approaches, which either ignore or approximate correlations among the means by a diagonal matrix, and a new method, to a single-stage analysis for GS using ridge regression best linear unbiased prediction (RR-BLUP). The new stage-wise method rotates (orthogonalizes) the adjusted means from the first stage before submitting them to the second stage. This makes the errors approximately independently and identically normally distributed, which is a prerequisite for many procedures that are potentially useful for GS such as machine learning methods (e.g. boosting) and regularized regression methods (e.g. lasso). This is illustrated in this paper using componentwise boosting. The componentwise boosting method minimizes squared error loss using least squares and iteratively and automatically selects markers that are most predictive of genomic breeding values. Results are compared with those of RR-BLUP using fivefold cross-validation. The new stage-wise approach with rotated means was slightly more similar to the single-stage analysis than the classical two-stage approaches based on non-rotated means for two unbalanced datasets. This suggests that rotation is a worthwhile pre-processing step in GS for the two-stage approaches for unbalanced datasets. Moreover, the predictive accuracy of stage-wise RR-BLUP was higher (5.0–6.1 %) than that of componentwise boosting.  相似文献   
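
The rotation step in entry 19 can be sketched as follows. The abstract says only that the adjusted means are rotated (orthogonalized) so that their errors become approximately independent and identically distributed; the Cholesky-based whitening below is one way to do that and is an assumption, as is the function name.

```python
import numpy as np

def rotate_adjusted_means(means, V, X):
    """Whiten first-stage adjusted means and the matching design matrix.

    means : adjusted genotype means from stage one
    V     : their estimated variance-covariance matrix (positive definite)
    X     : second-stage design matrix (for example, marker genotypes)

    With V = L L', multiplying by the inverse of L gives rotated means
    whose error covariance is the identity matrix.
    """
    L = np.linalg.cholesky(np.asarray(V, dtype=float))
    means_rot = np.linalg.solve(L, np.asarray(means, dtype=float))
    X_rot = np.linalg.solve(L, np.asarray(X, dtype=float))
    return means_rot, X_rot
```

The design matrix must be rotated with the same transformation as the means, otherwise the second-stage model no longer describes the rotated data.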

20.
We revisited, in a genomic context, the theory of hybrid genetic evaluation models of hybrid crosses of pure lines, as the current practice is largely based on infinitesimal model assumptions. Expressions for covariances between hybrids due to additive substitution effects and dominance and epistatic deviations were analytically derived. Using dense markers in a GBLUP analysis, it is possible to split specific combining ability into dominance and across-groups epistatic deviations, and to split general combining ability (GCA) into within-line additive effects and within-line additive by additive (and higher order) epistatic deviations. We analyzed a publicly available maize data set of Dent × Flint hybrids using our new model (called GCA-model) up to additive by additive epistasis. To model higher order interactions within GCAs, we also fitted “residual genetic” line effects. Our new GCA-model was compared with another genomic model which assumes a uniquely defined effect of genes across origins. Most variation in hybrids is accounted by GCA. Variances due to dominance and epistasis have similar magnitudes. Models based on defining effects either differently or identically across heterotic groups resulted in similar predictive abilities for hybrids. The currently used model inflates the estimated additive genetic variance. This is not important for hybrid predictions but has consequences for the breeding scheme—e.g. overestimation of the genetic gain within heterotic group. Therefore, we recommend using GCA-model, which is appropriate for genomic prediction and variance component estimation in hybrid crops using genomic data, and whose results can be practically interpreted and used for breeding purposes.  相似文献   
