Similar literature
1.
Maize (Zea mays L.) serves as a model plant for heterosis research and is the crop where hybrid breeding was pioneered. We analyzed genomic and phenotypic data of 1254 hybrids of a typical maize hybrid breeding program based on the important Dent × Flint heterotic pattern. Our main objectives were to investigate genome properties of the parental lines (e.g., allele frequencies, linkage disequilibrium, and phases) and examine the prospects of genomic prediction of hybrid performance. We found high consistency of linkage phases and large differences in allele frequencies between the Dent and Flint heterotic groups in pericentromeric regions. These results can be explained by the Hill–Robertson effect and support the hypothesis of differential fixation of alleles due to pseudo-overdominance in these regions. In pericentromeric regions we also found indications for consistent marker–QTL linkage between heterotic groups. With prediction methods GBLUP and BayesB, the cross-validation prediction accuracy ranged from 0.75 to 0.92 for grain yield and from 0.59 to 0.95 for grain moisture. The prediction accuracy of untested hybrids was highest if both parents were parents of other hybrids in the training set, and lowest if neither of them was involved in any training set hybrid. Optimizing the composition of the training set in terms of number of lines and hybrids per line could further increase prediction accuracy. We conclude that genomic prediction facilitates a paradigm shift in hybrid breeding by focusing on the performance of experimental hybrids rather than the performance of parental lines in testcrosses.

2.
In plant and animal breeding studies a distinction is made between the genetic value (additive plus epistatic genetic effects) and the breeding value (additive genetic effects) of an individual since it is expected that some of the epistatic genetic effects will be lost due to recombination. In this article, we argue that the breeder can take advantage of the epistatic marker effects in regions of low recombination. The models introduced here aim to estimate local epistatic line heritability by using genetic map information and combining local additive and epistatic effects. To this end, we have used semiparametric mixed models with multiple local genomic relationship matrices with hierarchical designs. Elastic-net postprocessing was used to introduce sparsity. Our models produce good predictive performance along with useful explanatory information.

3.
Genomic best linear unbiased prediction (BLUP) is a statistical method that uses relationships between individuals calculated from single-nucleotide polymorphisms (SNPs) to capture relationships at quantitative trait loci (QTL). We show that genomic BLUP exploits not only linkage disequilibrium (LD) and additive-genetic relationships, but also cosegregation to capture relationships at QTL. Simulations were used to study the contributions of those types of information to accuracy of genomic estimated breeding values (GEBVs), their persistence over generations without retraining, and their effect on the correlation of GEBVs within families. We show that accuracy of GEBVs based on additive-genetic relationships can decline with increasing training data size and speculate that modeling polygenic effects via pedigree relationships jointly with genomic breeding values using Bayesian methods may prevent that decline. Cosegregation information from half sibs contributes little to accuracy of GEBVs in current dairy cattle breeding schemes but from full sibs it contributes considerably to accuracy within family in corn breeding. Cosegregation information also declines with increasing training data size, and its persistence over generations is lower than that of LD, suggesting the need to model LD and cosegregation explicitly. The correlation between GEBVs within families depends largely on additive-genetic relationship information, which is determined by the effective number of SNPs and training data size. As genomic BLUP cannot capture short-range LD information well, we recommend Bayesian methods with t-distributed priors.
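As a rough illustration of what "relationships between individuals calculated from SNPs" can look like in practice, the sketch below builds a genomic relationship matrix using the commonly cited VanRaden-style formula G = ZZ'/(2 Σ p(1 − p)), where Z holds allele counts centred by twice the allele frequency. The choice of this formula, the function name `genomic_relationship_matrix`, and the toy genotypes are assumptions made for illustration; the abstract does not say which relationship matrix was used.

```python
def genomic_relationship_matrix(genotypes):
    """Return a genomic relationship matrix (assumed VanRaden-style scaling).

    genotypes: list of individuals, each a list of allele counts (0, 1 or 2)
    at the same ordered set of SNPs.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Frequency of the counted allele at each SNP, estimated from the sample.
    freqs = [sum(ind[j] for ind in genotypes) / (2.0 * n) for j in range(m)]
    # Centre each allele count by its expectation 2p.
    z = [[ind[j] - 2.0 * freqs[j] for j in range(m)] for ind in genotypes]
    # Scaling factor: sum over SNPs of the expected genotype variance 2p(1-p).
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all SNPs are monomorphic; G is undefined")
    return [
        [sum(z[a][j] * z[b][j] for j in range(m)) / scale for b in range(n)]
        for a in range(n)
    ]


# Toy example (hypothetical data): three individuals, three SNPs.
toy = [[0, 2, 1], [2, 0, 1], [1, 1, 1]]
G = genomic_relationship_matrix(toy)
```

In a GBLUP-type analysis a matrix like `G` would take the place of the pedigree relationship matrix in the mixed model equations; that step is omitted here to keep the sketch short.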

4.
The use of dense SNPs to predict the genetic value of an individual for a complex trait is often referred to as “genomic selection” in livestock and crops, but is also relevant to human genetics to predict, for example, complex genetic disease risk. The accuracy of prediction depends on the strength of linkage disequilibrium (LD) between SNPs and causal mutations. If sequence data were used instead of dense SNPs, accuracy should increase because causal mutations are present, but demographic history and long-term negative selection also influence accuracy. We therefore evaluated genomic prediction, using simulated sequence in two contrasting populations: one reducing from an ancestrally large effective population size (Ne) to a small one, with high LD common in domestic livestock, while the second had a large constant-sized Ne with low LD similar to that in some human or outbred plant populations. There were two scenarios in each population; causal variants were either neutral or under long-term negative selection. For large Ne, sequence data led to a 22% increase in accuracy relative to ∼600K SNP chip data with a Bayesian analysis and a more modest advantage with a BLUP analysis. This advantage increased when causal variants were influenced by negative selection, and accuracy persisted when 10 generations separated reference and validation populations. However, in the reducing Ne population, there was little advantage for sequence even with negative selection. This study demonstrates the joint influence of demography and selection on accuracy of prediction and improves our understanding of how best to exploit sequence for genomic prediction.

5.
Yi Jia  Jean-Luc Jannink 《Genetics》2012,192(4):1513-1522
Genetic correlations between quantitative traits measured in many breeding programs are pervasive. These correlations indicate that measurements of one trait carry information on other traits. Current single-trait (univariate) genomic selection does not take advantage of this information. Multivariate genomic selection on multiple traits could accomplish this but has been little explored and tested in practical breeding programs. In this study, three multivariate linear models (i.e., GBLUP, BayesA, and BayesCπ) were presented and compared to univariate models using simulated and real quantitative traits controlled by different genetic architectures. We also extended BayesA with fixed hyperparameters to a full hierarchical model that estimated hyperparameters and BayesCπ to impute missing phenotypes. We found that optimal marker-effect variance priors depended on the genetic architecture of the trait so that estimating them was beneficial. We showed that the prediction accuracy for a low-heritability trait could be significantly increased by multivariate genomic selection when a correlated high-heritability trait was available. Further, multiple-trait genomic selection had higher prediction accuracy than single-trait genomic selection when phenotypes were not available on all individuals and traits. Additional factors affecting the performance of multiple-trait genomic selection were explored.

6.
The term “effect” in additive genetic effect suggests a causal meaning. However, inferences of such quantities for selection purposes are typically viewed and conducted as a prediction task. Predictive ability as tested by cross-validation is currently the most acceptable criterion for comparing models and evaluating new methodologies. Nevertheless, it does not directly indicate if predictors reflect causal effects. Such evaluations would require causal inference methods that are not typical in genomic prediction for selection. This suggests that the usual approach to infer genetic effects contradicts the label of the quantity inferred. Here we investigate if genomic predictors for selection should be treated as standard predictors or if they must reflect a causal effect to be useful, requiring causal inference methods. Conducting the analysis as a prediction or as a causal inference task affects, for example, how covariates of the regression model are chosen, which may heavily affect the magnitude of genomic predictors and therefore selection decisions. We demonstrate that selection requires learning causal genetic effects. However, genomic predictors from some models might capture noncausal signal, providing good predictive ability but poorly representing true genetic effects. Simulated examples are used to show that aiming for predictive ability may lead to poor modeling decisions, while causal inference approaches may guide the construction of regression models that better infer the target genetic effect even when they underperform in cross-validation tests. In conclusion, genomic selection models should be constructed to aim primarily for identifiability of causal genetic effects, not for predictive ability.

7.
Through the theoretical analysis of the admixture linkage disequilibrium (ALD) in the gradual admixture (GA) model, in which admixture occurs in every generation, the ALD is found to be proportional to the difference in marker allele frequencies, p1-p2, between two subpopulations. Based on this property, we can employ a strict monotonic function (Δker=Δ/(p1-p2), where Δ denotes the linkage disequilibrium (LD)) of the recombination fraction between the marker locus and the disease locus to infer the true genetic linkage. We construct a quasi likelihood ratio test (LRT) for the case-only data utilizing the information of unlinked markers in the human genome. The simulation results show that our tests can be used to fine map a disease locus. The effects of parameter values in the ALD mapping are also discussed.
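The proportionality between admixture LD and p1-p2 can be seen in a much simpler setting than the GA model: a single mixing event between two source populations that each carry no LD of their own. The sketch below uses that simplified one-step mixture (an assumption for illustration, not the GA model analysed in the abstract) to show that dividing Δ by p1-p2 removes the dependence on the marker's allele frequencies. The function names and numbers are hypothetical.

```python
def mixture_ld(m, p1, q1, p2, q2):
    """LD between a marker and a second locus just after one mixing event.

    m      : proportion of the mixed population drawn from subpopulation 1
    p1, p2 : marker allele frequencies in subpopulations 1 and 2
    q1, q2 : allele frequencies at the second (e.g. disease) locus
    Assumes no LD within either source population before mixing.
    """
    haplotype = m * p1 * q1 + (1 - m) * p2 * q2  # frequency of the two-allele haplotype
    p = m * p1 + (1 - m) * p2                    # marker allele frequency after mixing
    q = m * q1 + (1 - m) * q2                    # second-locus allele frequency after mixing
    return haplotype - p * q


def scaled_ld(delta, p1, p2):
    """Return delta / (p1 - p2), the scaled quantity described in the abstract."""
    if p1 == p2:
        raise ValueError("marker is uninformative: p1 equals p2")
    return delta / (p1 - p2)


# Two markers with different allele frequencies, same second locus.
d_a = mixture_ld(0.3, 0.8, 0.6, 0.2, 0.1)
d_b = mixture_ld(0.3, 0.9, 0.6, 0.4, 0.1)
scaled_a = scaled_ld(d_a, 0.8, 0.2)
scaled_b = scaled_ld(d_b, 0.9, 0.4)
```

In this one-step setting the LD equals m(1-m)(p1-p2)(q1-q2), so `scaled_a` and `scaled_b` agree even though the raw LD values differ. In the GA model the scaled quantity additionally depends on the recombination fraction, which is what the abstract exploits for mapping.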

8.
郭伟  冯荣锦 《遗传学报》2006,33(1):12-18
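In the gradual admixture model, in which admixture occurs in every generation, a theoretical analysis of the admixture linkage disequilibrium shows that it is proportional to the difference in gene frequencies between the two subpopulations. On this basis, a function that is strictly monotonic in the recombination rate (Δker=Δ/(p1-p2), where Δ denotes the linkage disequilibrium) is constructed and used to infer genetic linkage between the marker locus and the disease locus. Using the linkage disequilibrium information provided by unlinked markers in the human genome, a quasi-likelihood ratio statistic is constructed from case-only data. Simulation results show that this test can be used for fine mapping of genes. The effects of the parameters on the test are also discussed.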

9.
10.
Shizhong Xu 《Genetics》2013,195(3):1103-1115
The correct models for quantitative trait locus mapping are the ones that simultaneously include all significant genetic effects. Such models are difficult to handle for high marker density. Improving statistical methods for high-dimensional data appears to have reached a plateau. Alternative approaches must be explored to break the bottleneck of genomic data analysis. The fact that all markers are located in a few chromosomes of the genome leads to linkage disequilibrium among markers. This suggests that dimension reduction can also be achieved through data manipulation. High-density markers are used to infer recombination breakpoints, which then facilitate construction of bins. The bins are treated as new synthetic markers. The number of bins is always a manageable number, on the order of a few thousand. Using the bin data of a recombinant inbred line population of rice, we demonstrated genetic mapping, using all bins in a simultaneous manner. To facilitate genomic selection, we developed a method to create user-defined (artificial) bins, in which breakpoints are allowed within bins. Using eight traits of rice, we showed that artificial bin data analysis often improves the predictability compared with natural bin data analysis. Of the eight traits, three showed high predictability, two had intermediate predictability, and two had low predictability. A binary trait with a known gene had predictability near perfect. Genetic mapping using bin data points to a new direction of genomic data analysis.
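One way to picture the "natural bin" idea is that consecutive markers showing no recombination breakpoint in any line of the population carry identical information and can be collapsed into a single synthetic marker. The sketch below implements that reading for a single chromosome; the function name `collapse_into_bins`, the 0/1 genotype coding, and the toy data are assumptions, and the abstract's own procedure (including artificial bins) may differ in detail.

```python
def collapse_into_bins(genotypes):
    """Collapse runs of adjacent markers with no breakpoint in any line.

    genotypes: one list per line, holding that line's genotype code at each
    marker in map order (e.g. 0 = parent A allele, 1 = parent B allele).
    Returns (bins, bin_genotypes) where bins is a list of
    (first_marker_index, last_marker_index) and bin_genotypes has one
    synthetic marker per bin for every line.
    """
    if not genotypes or not genotypes[0]:
        return [], []
    n_markers = len(genotypes[0])
    # View the data marker by marker: one tuple of genotypes across all lines.
    columns = [tuple(line[j] for line in genotypes) for j in range(n_markers)]
    bins = []
    start = 0
    for j in range(1, n_markers):
        # A change between adjacent columns means some line has a breakpoint here.
        if columns[j] != columns[j - 1]:
            bins.append((start, j - 1))
            start = j
    bins.append((start, n_markers - 1))
    # Every marker within a bin is identical, so the first one represents the bin.
    bin_genotypes = [[line[first] for first, _ in bins] for line in genotypes]
    return bins, bin_genotypes


# Toy example (hypothetical data): three lines, five ordered markers.
lines = [[0, 0, 0, 1, 1],
         [1, 1, 0, 0, 0],
         [0, 0, 0, 0, 0]]
bins, bin_data = collapse_into_bins(lines)
```

Here five markers reduce to three bins, which then serve as the predictors in a mapping or prediction model.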

11.
Prediction of genetic risk for disease is needed for preventive and personalized medicine. Genome-wide association studies have found unprecedented numbers of variants associated with complex human traits and diseases. However, these variants explain only a small proportion of genetic risk. Mounting evidence suggests that many traits, relevant to public health, are affected by large numbers of small-effect genes and that prediction of genetic risk to those traits and diseases could be improved by incorporating large numbers of markers into whole-genome prediction (WGP) models. We developed a WGP model incorporating thousands of markers for prediction of skin cancer risk in humans. We also considered other ways of incorporating genetic information into prediction models, such as family history or ancestry (using principal components, PCs, of informative markers). Prediction accuracy was evaluated using the area under the receiver operating characteristic curve (AUC) estimated in a cross-validation. Incorporation of genetic information (i.e., familial relationships, PCs, or WGP) yielded a significant increase in prediction accuracy: from an AUC of 0.53 for a baseline model that accounted for nongenetic covariates to AUCs of 0.58 (pedigree), 0.62 (PCs), and 0.64 (WGP). In summary, prediction of skin cancer risk could be improved by considering genetic information and using a large number of single-nucleotide polymorphisms (SNPs) in a WGP model, which allows for the detection of patterns of genetic risk that are above and beyond those that can be captured using family history. We discuss avenues for improving prediction accuracy and speculate on the possible use of WGP to prospectively identify individuals at high risk.
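For readers less familiar with the AUC used to score these models, the sketch below computes it directly from its probabilistic meaning: the chance that a randomly chosen case receives a higher predicted risk than a randomly chosen control, with ties counted as one half. The function name `auc` and the toy scores are assumptions for illustration and are unrelated to the study's data.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise case/control comparisons.

    scores: predicted risks, one per individual
    labels: 1 for cases, 0 for controls (same order as scores)
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for s_case in cases:
        for s_control in controls:
            if s_case > s_control:
                wins += 1.0   # case ranked above control
            elif s_case == s_control:
                wins += 0.5   # ties count as half
    return wins / (len(cases) * len(controls))


# Toy example (hypothetical predictions).
example = auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])
```

An AUC of 0.5 corresponds to chance-level ranking, which helps put the reported gain from 0.53 to 0.64 into perspective. The quadratic loop is fine for a sketch; a rank-based calculation would be preferable for large validation sets.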

12.
Intense structuring of plant breeding populations challenges the design of the training set (TS) in genomic selection (GS). An important open question is how the TS should be constructed from multiple related or unrelated small biparental families to predict progeny from individual crosses. Here, we used a set of five interconnected maize (Zea mays L.) populations of doubled-haploid (DH) lines derived from four parents to systematically investigate how the composition of the TS affects the prediction accuracy for lines from individual crosses. A total of 635 DH lines genotyped with 16,741 polymorphic SNPs were evaluated for five traits including Gibberella ear rot severity and three kernel yield component traits. The populations showed a genomic similarity pattern, which reflects the crossing scheme with a clear separation of full sibs, half sibs, and unrelated groups. Prediction accuracies within full-sib families of DH lines followed closely theoretical expectations, accounting for the influence of sample size and heritability of the trait. Prediction accuracies declined by 42% if full-sib DH lines were replaced by half-sib DH lines, but statistically significantly better results could be achieved if half-sib DH lines were available from both instead of only one parent of the validation population. Once both parents of the validation population were represented in the TS, including more crosses with a constant TS size did not increase accuracies. Unrelated crosses showing opposite linkage phases with the validation population resulted in negative or reduced prediction accuracies, if used alone or in combination with related families, respectively. We suggest identifying and excluding such crosses from the TS. Moreover, the observed variability among populations and traits suggests that these uncertainties must be taken into account in models optimizing the allocation of resources in GS.

13.
In genome-based prediction there is considerable uncertainty about the statistical model and method required to maximize prediction accuracy. For traits influenced by a small number of quantitative trait loci (QTL), predictions are expected to benefit from methods performing variable selection [e.g., BayesB or the least absolute shrinkage and selection operator (LASSO)] compared to methods distributing effects across the genome [ridge regression best linear unbiased prediction (RR-BLUP)]. We investigate the assumptions underlying successful variable selection by combining computer simulations with large-scale experimental data sets from rice (Oryza sativa L.), wheat (Triticum aestivum L.), and Arabidopsis thaliana (L.). We demonstrate that variable selection can be successful when the number of phenotyped individuals is much larger than the number of causal mutations contributing to the trait. We show that the sample size required for efficient variable selection increases dramatically with decreasing trait heritabilities and increasing extent of linkage disequilibrium (LD). We contrast and discuss contradictory results from simulation and experimental studies with respect to superiority of variable selection methods over RR-BLUP. Our results demonstrate that due to long-range LD, medium heritabilities, and small sample sizes, superiority of variable selection methods cannot be expected in plant breeding populations even for traits like FRIGIDA gene expression in Arabidopsis and flowering time in rice, assumed to be influenced by a few major QTL. We extend our conclusions to the analysis of whole-genome sequence data and infer upper bounds for the number of causal mutations which can be identified by LASSO. Our results have major impact on the choice of statistical method needed to make credible inferences about genetic architecture and prediction accuracy of complex traits.

14.
Multiparental designs combined with dense genotyping of parents have been proposed as a way to increase the diversity and resolution of quantitative trait loci (QTL) mapping studies, using methods combining linkage disequilibrium information with linkage analysis (LDLA). Two new nested association mapping designs adapted to European conditions were derived from the complementary dent and flint heterotic groups of maize (Zea mays L.). Ten biparental dent families (N = 841) and 11 biparental flint families (N = 811) were genotyped with 56,110 single nucleotide polymorphism markers and evaluated as test crosses with the central line of the reciprocal design for biomass yield, plant height, and precocity. Alleles at candidate QTL were defined as (i) parental alleles, (ii) haplotypic identity by descent, and (iii) single-marker groupings. Between five and 16 QTL were detected depending on the model, trait, and genetic group considered. In the flint design, a major QTL (R2 = 27%) with pleiotropic effects was detected on chromosome 10, whereas other QTL displayed milder effects (R2 < 10%). On average, the LDLA models detected more QTL but generally explained lower percentages of variance, consistent with the fact that most QTL display complex allelic series. Only 15% of the QTL were common to the two designs. A joint analysis of the two designs detected between 15 and 21 QTL for the five traits. Of these, between 27% (silking date) and 41% (tasseling date) were significant in both groups. Favorable allelic effects detected in both groups open perspectives for improving biomass production.

15.
We analysed a QTL affecting milk yield (MY), milk protein yield (PY) and milk fat yield (FY) in the dual purpose cattle breed Fleckvieh on BTA5. Twenty-six microsatellite markers covering 135 cM were selected to analyse nine half-sib families containing 605 sons in a granddaughter design. We thereby assigned two new markers to the public linkage map using the CRI-MAP program. Phenotypic records were daughter yield deviations (DYD) originating from the routinely performed genetic evaluations of breeding animals. To determine the position of the QTL, three different approaches were applied: interval mapping (IM), linkage analysis by variance component analysis (LAVC), and combined linkage disequilibrium (LD) and linkage (LDL) analysis. All three methods mapped the QTL in the same marker interval (BM2830-ETH152) with the greatest test-statistic value at 118, 119.33 and 119.33 cM respectively. The positive QTL allele simultaneously increases DYD in the first lactation by 272 kg milk, 7.1 kg milk protein and 7.0 kg milk fat. Although the mapping accuracy and the significance of a QTL effect increased from IM over LAVC to LDL, the confidence interval was large (13, 20 and 24 cM for FY, MY and PY respectively) for the positional cloning of the causal gene. The estimated averages of pairwise marker LD with a distance <5 cM were low (0.107) and reflect the large effective population size of the Fleckvieh subpopulation analysed. This low level of LD suggests a need to increase marker density in subsequent fine-mapping steps.

16.
Polydextrose is a randomly linked complex glucose oligomer that is widely used as a sugar replacer, bulking agent, dietary fiber and prebiotic. Polydextrose is poorly utilized by the host and, during gastrointestinal transit, it is slowly degraded by intestinal microbes, although it is not known which parts of the complex molecule are preferred by the microbes. The microbial degradation of polydextrose was assessed by using a simulated model of colonic fermentation. The degradation products and their glycosidic linkages were measured by combined gas chromatography and mass spectrometry, and compared to those of intact polydextrose. Fermentation resulted in an increase in the relative abundance of non-branched molecules with a concomitant decrease in single-branched glucose molecules and a reduced total number of branching points. A detailed analysis showed a preponderance of 1,6 pyranose linkages. The results of this study demonstrate how intestinal microbes selectively degrade polydextrose, and provide an insight into the preferences of gut microbiota in the presence of different glycosidic linkages.

17.
M. Loukas  Y. Vergini    C. B. Krimbas 《Genetics》1981,97(2):429-441
Urea denaturation of allozymes was used to provide finer resolution of allelic states within classes of different electrophoretic mobility. This method gives perfectly repeatable results. About 170 isogenic strains for the O chromosome of Drosophila subobscura, derived from two natural populations, were constructed. Their gene arrangements were studied, as well as eight polymorphic genes located on the O chromosome (Est-5, Odh, Ao, ME, Xdh, Lap, Pept-1 and Acph). Crosses performed indicate that differences in urea sensitivity are genetically controlled by the same genes that control electrophoretic mobility. Twice as many alleles have been detected in comparison to the usual electrophoretic method. However, the effective number of alleles did not increase considerably. Studies of linkage disequilibria, by taking into account the finer resolution of allelic states, gave results nearly identical with those obtained in studies where the usual electrophoretic method was used. Although the power of the test is diminished, the absence of genic associations seems to indicate that there are no hidden linkage disequilibria in electrophoretic studies (because of consolidation effects of real alleles into few electromorph classes). The paucity of linkage disequilibria would indicate that there are no epistatic interactions such as those suggested in the model of Franklin and Lewontin (1970).

18.
Tanaka N  Yokoyama T  Abe H  Ninagi O  Oshiki T 《Genetica》2002,114(1):89-94
To analyze the degree of pairing of the Z and W chromosomes in ZZWW tetraploid female silkworms that have the W chromosomes of the domesticated silkworm, Bombyx mori, and those of the wild silkworm, Bombyx mandarina, we induced two types of ZZWW tetraploid female silkworms (Cr4n, Wr4n) through cold treatment of the eggs. The Wr4n female is congenic to the Cr4n female for W chromosomes; namely, the W chromosomes of the Wr4n female are derived from those of B. mandarina. Each of the sex ratios (♀/♂) in filial triploids from the Cr4n females was shown to be in the range of 3.9–5.3 (4.6 as an average of six cases). On the other hand, each of the sex ratios (♀/♂) in filial triploids from the Wr4n females was shown to be in the range of 6.2–9.0 (6.9 as an average of nine cases). The results of a t-test indicated that the difference in sex ratios in the two groups is highly significant (at the 0.1% level). These results suggest that, in the meiosis of the ZZWW tetraploid female, the frequency of pairing of the W chromosome of B. mandarina and the Z chromosome of B. mori is lower than that of the pairing of the W and Z chromosomes of B. mori. Furthermore, the t-test results are evidence that the W chromosomes have undergone significant evolutionary change.

19.
The significant enhancing effect of glutamate on DNA binding by Escherichia coli nucleic acid binding proteins has been extensively documented. Glutamate has also often been observed to reduce the apparent linked ion release (Δnions) upon DNA binding. In this study, it is shown that the Klenow and Klentaq large fragments of the Type I DNA polymerases from E. coli and Thermus aquaticus both display enhanced DNA binding affinity in the presence of glutamate versus chloride. Across the relatively narrow salt concentration ranges often used to obtain salt linkage data, Klenow displays an apparently decreased Δnions in the presence of Kglutamate, while Klentaq appears not to display an anion-specific effect on Δnions. Osmotic stress experiments reveal that DNA binding by Klenow and Klentaq is associated with the release of ∼500 to 600 waters in the presence of KCl. For both proteins, replacing chloride with glutamate results in a 70% reduction in the osmotic-stress-measured hydration change associated with DNA binding (to ∼150-200 waters released), suggesting that glutamate plays a significant osmotic role. Measurements of the salt-DNA binding linkages were extended up to 2.5 M Kglutamate to further examine this osmotic effect of glutamate, and it is observed that a reversal of the salt linkage occurs above 800 mM for both Klenow and Klentaq. Salt-addition titrations confirm that an increase of [Kglutamate] beyond 1 M results in rebinding of salt-displaced polymerase to DNA. These data represent a rare documentation of a reversed ion linkage for a protein-DNA interaction (i.e., enhanced binding as salt concentration increases). Nonlinear linkage analysis indicates that this unusual behavior can be quantitatively accounted for by a shifting balance of ionic and osmotic effects as [Kglutamate] is increased. These results are predicted to be general for protein-DNA interactions in glutamate salts.

20.