Similar articles
 20 similar articles found (search time: 10 ms)
1.
The use of dense SNPs to predict the genetic value of an individual for a complex trait is often referred to as “genomic selection” in livestock and crops, but is also relevant to human genetics to predict, for example, complex genetic disease risk. The accuracy of prediction depends on the strength of linkage disequilibrium (LD) between SNPs and causal mutations. If sequence data were used instead of dense SNPs, accuracy should increase because causal mutations are present, but demographic history and long-term negative selection also influence accuracy. We therefore evaluated genomic prediction, using simulated sequence in two contrasting populations: one reducing from an ancestrally large effective population size (Ne) to a small one, with high LD common in domestic livestock, while the second had a large constant-sized Ne with low LD similar to that in some human or outbred plant populations. There were two scenarios in each population; causal variants were either neutral or under long-term negative selection. For large Ne, sequence data led to a 22% increase in accuracy relative to ∼600K SNP chip data with a Bayesian analysis and a more modest advantage with a BLUP analysis. This advantage increased when causal variants were influenced by negative selection, and accuracy persisted when 10 generations separated reference and validation populations. However, in the reducing Ne population, there was little advantage for sequence even with negative selection. This study demonstrates the joint influence of demography and selection on accuracy of prediction and improves our understanding of how best to exploit sequence for genomic prediction.

2.
In plant and animal breeding studies a distinction is made between the genetic value (additive plus epistatic genetic effects) and the breeding value (additive genetic effects) of an individual since it is expected that some of the epistatic genetic effects will be lost due to recombination. In this article, we argue that the breeder can take advantage of the epistatic marker effects in regions of low recombination. The models introduced here aim to estimate local epistatic line heritability by using genetic map information and combining local additive and epistatic effects. To this end, we have used semiparametric mixed models with multiple local genomic relationship matrices with hierarchical designs. Elastic-net postprocessing was used to introduce sparsity. Our models produce good predictive performance along with useful explanatory information.

3.
Plant breeding populations exhibit varying levels of structure and admixture; these features are likely to induce heterogeneity of marker effects across subpopulations. Traditionally, structure has been dealt with as a potential confounder, and various methods exist to “correct” for population stratification. However, these methods induce a mean correction that does not account for heterogeneity of marker effects. The animal breeding literature offers a few recent studies that consider modeling genetic heterogeneity in multibreed data, using multivariate models. However, these methods have received little attention in plant breeding, where population structure can take different forms. In this article we address the problem of analyzing data from heterogeneous plant breeding populations, using three approaches: (a) a model that ignores population structure [A-genome-based best linear unbiased prediction (A-GBLUP)], (b) a stratified (i.e., within-group) analysis (W-GBLUP), and (c) a multivariate approach that uses multigroup data and accounts for heterogeneity (MG-GBLUP). The performance of the three models was assessed on three different data sets: a diversity panel of rice (Oryza sativa), a maize (Zea mays L.) half-sib panel, and a wheat (Triticum aestivum L.) data set that originated from plant breeding programs. The estimated genomic correlations between subpopulations varied from null to moderate, depending on the genetic distance between subpopulations and traits. Our assessment of prediction accuracy features cases where ignoring population structure leads to a parsimonious, more powerful model as well as others where the multivariate and stratified approaches have higher predictive power. In general, the multivariate approach appeared slightly more robust than either the A- or the W-GBLUP.

4.
The term “effect” in additive genetic effect suggests a causal meaning. However, inferences of such quantities for selection purposes are typically viewed and conducted as a prediction task. Predictive ability as tested by cross-validation is currently the most acceptable criterion for comparing models and evaluating new methodologies. Nevertheless, it does not directly indicate if predictors reflect causal effects. Such evaluations would require causal inference methods that are not typical in genomic prediction for selection. This suggests that the usual approach to infer genetic effects contradicts the label of the quantity inferred. Here we investigate if genomic predictors for selection should be treated as standard predictors or if they must reflect a causal effect to be useful, requiring causal inference methods. Conducting the analysis as a prediction or as a causal inference task affects, for example, how covariates of the regression model are chosen, which may heavily affect the magnitude of genomic predictors and therefore selection decisions. We demonstrate that selection requires learning causal genetic effects. However, genomic predictors from some models might capture noncausal signal, providing good predictive ability but poorly representing true genetic effects. Simulated examples are used to show that aiming for predictive ability may lead to poor modeling decisions, while causal inference approaches may guide the construction of regression models that better infer the target genetic effect even when they underperform in cross-validation tests. In conclusion, genomic selection models should be constructed to aim primarily for identifiability of causal genetic effects, not for predictive ability.

5.
Shizhong Xu 《Genetics》2013,195(3):1103-1115
The correct models for quantitative trait locus mapping are the ones that simultaneously include all significant genetic effects. Such models are difficult to handle for high marker density. Improving statistical methods for high-dimensional data appears to have reached a plateau. Alternative approaches must be explored to break the bottleneck of genomic data analysis. The fact that all markers are located in a few chromosomes of the genome leads to linkage disequilibrium among markers. This suggests that dimension reduction can also be achieved through data manipulation. High-density markers are used to infer recombination breakpoints, which then facilitate construction of bins. The bins are treated as new synthetic markers. The number of bins is always a manageable number, on the order of a few thousand. Using the bin data of a recombinant inbred line population of rice, we demonstrated genetic mapping, using all bins in a simultaneous manner. To facilitate genomic selection, we developed a method to create user-defined (artificial) bins, in which breakpoints are allowed within bins. Using eight traits of rice, we showed that artificial bin data analysis often improves the predictability compared with natural bin data analysis. Of the eight traits, three showed high predictability, two had intermediate predictability, and two had low predictability. A binary trait with a known gene had predictability near perfect. Genetic mapping using bin data points to a new direction of genomic data analysis.
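The bin idea in this abstract can be illustrated with a small sketch. It is not the author's implementation: it assumes markers on a single chromosome in map order, genotypes coded as integers, and a breakpoint wherever any line changes genotype between adjacent markers. The names `build_bins` and `bin_genotypes` are hypothetical.

```python
from typing import List, Tuple


def build_bins(genotypes: List[List[int]]) -> List[Tuple[int, int]]:
    """Group consecutive markers into bins free of breakpoints.

    genotypes[i][j] is the genotype code of line i at marker j, with
    markers in map order on one chromosome (an assumption of this sketch).
    A new bin starts whenever at least one line changes genotype between
    two adjacent markers. Returns inclusive (first, last) marker indices.
    """
    n_markers = len(genotypes[0])
    bins = []
    start = 0
    for j in range(1, n_markers):
        if any(line[j] != line[j - 1] for line in genotypes):
            bins.append((start, j - 1))
            start = j
    bins.append((start, n_markers - 1))
    return bins


def bin_genotypes(genotypes: List[List[int]],
                  bins: List[Tuple[int, int]]) -> List[List[int]]:
    """Use the first marker of each bin as the synthetic bin marker."""
    return [[line[first] for first, _ in bins] for line in genotypes]


# Toy data: three lines, six markers (made up for illustration).
toy = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
bins = build_bins(toy)
reduced = bin_genotypes(toy, bins)
print(bins)     # [(0, 2), (3, 3), (4, 5)]
print(reduced)  # [[0, 1, 1], [1, 1, 0], [0, 0, 0]]
```

Within a bin every line has a constant genotype, so one column per bin loses no information in this toy setting; real data would also need handling of missing calls and genotyping errors.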

6.
    
Doubled haploids are routinely created and phenotypically selected in plant breeding programs to accelerate the breeding cycle. Genomic selection, which makes use of both phenotypes and genotypes, has been shown to further improve genetic gain through prediction of performance before or without phenotypic characterization of novel germplasm. Additional opportunities exist to combine genomic prediction methods with the creation of doubled haploids. Here we propose an extension to genomic selection, optimal haploid value (OHV) selection, which predicts the best doubled haploid that can be produced from a segregating plant. This method focuses selection on the haplotype and optimizes the breeding program toward its end goal of generating an elite fixed line. We rigorously tested OHV selection breeding programs, using computer simulation, and show that it results in up to 0.6 standard deviations more genetic gain than genomic selection. At the same time, OHV selection preserved a substantially greater amount of genetic diversity in the population than genomic selection, which is important to achieve long-term genetic gain in breeding populations.
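One way to read "the best doubled haploid that can be produced from a segregating plant" is sketched below. The abstract does not give the formula, so everything here is an assumption: alleles are coded 0/1, `effects` holds the effect of allele 1 at each marker, the genome is split into fixed haplotype segments, and the better of the plant's two haplotypes is picked per segment and doubled. The function name and toy values are hypothetical.

```python
from typing import Sequence


def optimal_haploid_value(hap_a: Sequence[int],
                          hap_b: Sequence[int],
                          effects: Sequence[float],
                          segment_ends: Sequence[int]) -> float:
    """Value of the best doubled haploid obtainable from one plant.

    hap_a, hap_b : 0/1 allele codes of the plant's two haplotypes.
    effects      : per-marker effect of allele 1 (assumed known).
    segment_ends : exclusive end index of each segment, in order; the
                   last entry must equal the number of markers.
    For each segment the better haplotype is kept; the sum is doubled
    because a doubled haploid carries two identical copies.
    """
    total = 0.0
    start = 0
    for end in segment_ends:
        value_a = sum(h * e for h, e in zip(hap_a[start:end], effects[start:end]))
        value_b = sum(h * e for h, e in zip(hap_b[start:end], effects[start:end]))
        total += max(value_a, value_b)
        start = end
    return 2.0 * total


# Toy plant with four markers in two segments (made-up numbers).
hap_a = [1, 0, 1, 0]
hap_b = [0, 1, 0, 1]
effects = [0.5, -0.2, 0.1, 0.4]
ohv = optimal_haploid_value(hap_a, hap_b, effects, [2, 4])
# The plant's own genomic value simply adds both haplotypes.
gebv = sum(e * (a + b) for a, b, e in zip(hap_a, hap_b, effects))
print(ohv, gebv)
```

Under these assumptions the OHV can never be lower than the plant's own summed value, which is why selecting on it favors plants carrying complementary good segments.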

7.
8.
Yong Jiang  Jochen C. Reif 《Genetics》2015,201(2):759-768
Modeling epistasis in genomic selection is impeded by a high computational load. The extended genomic best linear unbiased prediction (EG-BLUP) with an epistatic relationship matrix and the reproducing kernel Hilbert space regression (RKHS) are two attractive approaches that reduce the computational load. In this study, we proved the equivalence of EG-BLUP and genomic selection approaches that explicitly model epistatic effects. Moreover, we have shown why the RKHS model based on a Gaussian kernel captures epistatic effects among markers. Using experimental data sets in wheat and maize, we compared different genomic selection approaches and concluded that prediction accuracy can be improved by modeling epistasis for selfing species but may not be for outcrossing species.

9.
Although the concept of genomic selection relies on linkage disequilibrium (LD) between quantitative trait loci and markers, reliability of genomic predictions is strongly influenced by family relationships. In this study, we investigated the effects of LD and family relationships on reliability of genomic predictions and the potential of deterministic formulas to predict reliability using population parameters in populations with complex family structures. Five groups of selection candidates were simulated by taking different information sources from the reference population into account: (1) allele frequencies, (2) LD pattern, (3) haplotypes, (4) haploid chromosomes, and (5) individuals from the reference population, thereby having real family relationships with reference individuals. Reliabilities were predicted using genomic relationships among 529 reference individuals and their relationships with selection candidates and with a deterministic formula where the number of effective chromosome segments (Me) was estimated based on genomic and additive relationship matrices for each scenario. At a heritability of 0.6, reliabilities based on genomic relationships were 0.002 ± 0.0001 (allele frequencies), 0.022 ± 0.001 (LD pattern), 0.018 ± 0.001 (haplotypes), 0.100 ± 0.008 (haploid chromosomes), and 0.318 ± 0.077 (family relationships). At a heritability of 0.1, relative differences among groups were similar. For all scenarios, reliabilities were similar to predictions with a deterministic formula using estimated Me. Thus, reliabilities can be predicted accurately using empirically estimated Me, and the level of relationship with reference individuals has a much higher effect on reliability than linkage disequilibrium per se. Furthermore, accumulated length of shared haplotypes is more important in determining the reliability of genomic prediction than the individual shared haplotype length.
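The abstract does not state the deterministic formula it used. As a rough illustration only, the sketch below uses one commonly cited form, reliability = N·h² / (N·h² + Me), where N is the reference population size; treating this as the formula of the study is an assumption. The function name and the Me values in the loop are hypothetical and are not results from the paper.

```python
def expected_reliability(n_ref: int, h2: float, me: float) -> float:
    """Expected reliability of genomic prediction (assumed formula).

    n_ref : number of reference individuals.
    h2    : heritability of the trait, in (0, 1].
    me    : number of effective chromosome segments (must be positive).
    """
    if n_ref <= 0 or not (0.0 < h2 <= 1.0) or me <= 0:
        raise ValueError("n_ref and me must be positive and 0 < h2 <= 1")
    signal = n_ref * h2
    return signal / (signal + me)


# 529 reference individuals and h2 = 0.6 come from the abstract;
# the Me values below are made up to show the trend.
for me in (1_000, 10_000, 100_000):
    print(me, round(expected_reliability(529, 0.6, me), 4))
```

Under this form a larger Me, which corresponds to weaker relationships between candidates and the reference population, gives lower reliability. That is the same direction as the ordering of the five groups reported in the abstract.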

10.
Heritability is a population parameter of importance in evolution, plant and animal breeding, and human medical genetics. It can be estimated using pedigree designs and, more recently, using relationships estimated from markers. We derive the sampling variance of the estimate of heritability for a wide range of experimental designs, assuming that estimation is by maximum likelihood and that the resemblance between relatives is solely due to additive genetic variation. We show that well-known results for balanced designs are special cases of a more general unified framework. For pedigree designs, the sampling variance is inversely proportional to the variance of relationship in the pedigree and it is proportional to 1/N, whereas for population samples it is approximately proportional to 1/N², where N is the sample size. Variation in relatedness is a key parameter in the quantification of the sampling variance of heritability. Consequently, the sampling variance is high for populations with large recent effective population size (e.g., humans) because this causes low variation in relationship. However, even using human population samples, low sampling variance is possible with high N.
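The two scaling rules in this abstract have a practical consequence for study size, which the sketch below works out. It uses only the stated proportionalities (variance proportional to 1/N for pedigree designs and to 1/N squared for population samples) and assumes the variance of relationship stays fixed as N changes. The function name is hypothetical.

```python
import math


def relative_se(n_old: int, n_new: int, design: str) -> float:
    """Factor by which the standard error of the heritability estimate
    changes when the sample size goes from n_old to n_new.

    design : "pedigree"   -> sampling variance proportional to 1/N
             "population" -> sampling variance proportional to 1/N**2
    The variance of relationship is assumed constant (sketch assumption).
    """
    power = {"pedigree": 1, "population": 2}[design]
    variance_ratio = (n_old / n_new) ** power
    return math.sqrt(variance_ratio)


# Quadrupling N halves the SE in a pedigree design but quarters it
# in a population sample.
print(relative_se(1000, 4000, "pedigree"))    # 0.5
print(relative_se(1000, 4000, "population"))  # 0.25
```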

11.
The current study was designed to investigate the effects of the purH gene on chicken muscle inosine monophosphate (IMP) content. Muscle IMP content was measured in five chicken breeds. Single nucleotide polymorphisms (SNPs) were detected by PCR-SSCP and DNA sequencing. Two SNPs were detected: an A/T substitution at position 8023 in exon 9 and a T/C substitution at position 17446 in exon 16. The results indicated that only the T17446C polymorphism was associated with IMP content. The haplotype effect was higher than the single genotype effect. We tentatively conclude that the purH gene is a candidate locus or is linked to a major gene that affects muscle IMP content. Haplotypes are superior to single genotypes as potential molecular markers for meat quality traits in chicken.

12.
13.
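Analysis of the genetic effects of major genes plus polygenes on oil content in Brassica napus   Total citations: 13, self-citations: 0, citations by others: 13
Using a statistical method for the joint multi-generation analysis of mixed major gene plus polygene inheritance of quantitative traits, we analyzed the genetic effects on oil content in five generations (parents P1 and P2, F1, F2, and F2:3 families) from two crosses of Brassica napus. The frequency distributions of oil content in the segregating F2 generation and the F2:3 families were mixtures of normal distributions, consistent with major gene plus polygene inheritance. Model D-2 was the best-fitting genetic model for oil content in both crosses: oil content is controlled jointly by one additive major gene and additive-dominance polygenes. In cross 1 (1141B × Ken C1-1), the additive effect of the major gene was -1.74, indicating that the allele at the major-gene locus in parent 1141B decreases oil content, whereas the allele in parent Ken C1-1 increases it. The additive and dominance effects of the polygenes were 1.20 and -1.93, respectively; the major-gene and polygene heritabilities were 68.21% and 27.17% in F2, and 81.70% and 16.80% in F2:3. In cross 2 (32B × Ken C1-2), the additive effect of the major gene was -3.74, indicating that the allele at the major-gene locus in parent 32B decreases oil content, whereas the allele in parent Ken C1-2 increases it. The additive and dominance effects of the polygenes were -1.99 and 0.93, respectively; the major-gene and polygene heritabilities were 66.20% and 28.10% in F2, and 81.00% and 14.90% in F2:3. In both crosses the major-gene heritability of oil content was higher in the F2:3 families than in F2, so selection for high oil content is considered more efficient in F2:3 families.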

14.
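Using shotgun sequencing, we obtained two genomic fragments of 328 kb and 753 kb, located in regions 3p25.1 and 3p26.1, respectively. We analyzed the GC content and the distribution of repetitive sequences in each fragment and examined the gene density in the different regions. The 328 kb fragment in 3p25.1 had a relatively high GC content and a high density of protein-coding genes, whereas the 753 kb fragment in 3p26.1 had a lower average GC content and was gene-poor. In addition, GC-rich SINE repeats had higher coverage in regions of higher GC content, whereas AT-rich LINE repeats were more abundant in regions of lower GC content. This relationship between gene distribution and genome composition is the result of long-term coevolution of genes and the genome.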

15.
A joint segregation analysis under a mixed major gene plus polygene genetic model was conducted to study the inheritance of oil content in Brassica napus L. Five populations, i.e., the two parents (P1 and P2), F1, F2, and F2:3 families (derived from F2), from each of two crosses (1141B × Ken C1-1 and 32B × Ken C1-2) were investigated. The frequency distributions of oil content in the F2 and F2:3 family populations showed the characteristics of a mixed normal distribution, which indicated that the inheritance of oil content followed a major gene plus polygene model. Twenty-one genetic models were established, which could be classified into five types: one major gene, two major genes, polygenes, one major gene plus polygenes, and two major genes plus polygenes. The most suitable genetic model was selected using Akaike's Information Criterion, and its goodness of fit was examined by a set of tests. The results showed that genetic model D-2 was the best-fitting model for the trait. In other words, oil content in oilseed rape is controlled by one additive major gene plus additive and dominance polygenes. For cross 1 (1141B × Ken C1-1), the heritabilities of the major gene and the polygenes were 68.21% and 27.17%, respectively, in F2, and 81.70% and 16.80%, respectively, in F2:3. The additive effect of the major gene was -1.74, which indicates that the allele at this locus in parent 1141B may decrease the oil content, whereas that in parent Ken C1-1 may increase it. The additive and dominance effects of the polygenes were 1.20 and -1.93, respectively. For cross 2 (32B × Ken C1-2), the heritabilities of the major gene and the polygenes were 66.20% and 28.10%, respectively, in F2, and 81.00% and 14.90%, respectively, in F2:3. The additive effect of the major gene was -3.74, which also indicates that the allele at this locus in parent 32B may decrease the oil content, whereas that in parent Ken C1-2 may increase it. The additive and dominance effects of the polygenes were -1.99 and 0.93, respectively. The heritability of the major gene was higher in F2:3 than in F2 in both crosses, so it would be more efficient to select for high oil content in F2:3 families.
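The model-selection step named in this abstract, Akaike's Information Criterion, can be sketched in a few lines. The formula AIC = 2k - 2·ln(L) is standard; the model labels, log-likelihoods, and parameter counts below are invented for illustration and are not taken from the study.

```python
from typing import Dict, Tuple


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike's Information Criterion: 2k - 2 ln(L); lower is better."""
    return 2 * n_params - 2 * log_likelihood


def best_model(fits: Dict[str, Tuple[float, int]]) -> str:
    """Return the name of the candidate model with the smallest AIC.

    fits maps a model name to (maximized log-likelihood, parameter count).
    """
    return min(fits, key=lambda name: aic(*fits[name]))


# Hypothetical fits for three candidate model types (made-up numbers).
fits = {
    "one_major_gene": (-512.3, 4),
    "polygenes_only": (-505.0, 5),
    "one_major_gene_plus_polygenes": (-498.7, 7),
}
for name, (loglik, k) in fits.items():
    print(name, round(aic(loglik, k), 1))
print("selected:", best_model(fits))
```

AIC only ranks the candidates. As the abstract notes, the selected model still needs separate goodness-of-fit tests.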

16.
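This article describes quality control for the XD683 electrolyte analyzer, covering the instrument itself, the reagents used, each step of the analytical procedure, and instrument maintenance, inspection, and calibration.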

17.
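We analyzed the polymorphism of two simple sequence repeats (SSRs), (CT)n and (AATT)n, in the leader region of the waxy gene in 74 rice accessions, and their relationship with amylose content (AC). The accessions included indica rice (Oryza sativa L. ssp. indica), japonica rice (O. sativa ssp. japonica), and common wild rice (O. rufipogon), and their AC values covered the full range found in cultivated rice. Eight alleles were detected with the (CT)n marker; japonica cultivars tended to carry alleles with more repeats (n ≥ 16), whereas alleles with fewer repeats (n ≤ 14) occurred only in indica. Two alleles were detected with (AATT)n, and a few wild rice plants were heterozygous. AC was highly correlated with the genotype at these two SSRs: high-AC (> 22.0%) cultivars carried (CT) alleles with fewer repeats (n ≤ 14); conversely, except for glutinous rice, all cultivars with low or intermediate AC carried (CT) alleles with more repeats (n ≥ 16). Cultivars carrying the (AATT)6 allele, which has more repeats, mostly had high AC, and cultivars carrying the (AATT)5 allele, which has fewer repeats, mostly had low or intermediate AC. Differences in AC among cultivars with different SSR genotypes were highly significant. Although a direct function of these two SSRs in amylose synthesis cannot yet be established, the nearly complete correlation between SSR variation and AC means that they can be used directly as molecular markers for improving rice quality.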

18.
19.
Cereal genes are classified into two distinct classes according to the guanine-cytosine (GC) content at the third codon sites (GC3). Natural selection and mutation bias have been proposed to affect the GC content. However, there has been controversy about the cause of GC variation. Here, we characterized the GC content of 1092 paralogs and other single-copy genes in the duplicated chromosomal regions of the rice genome (ssp. indica) and classified the paralogs into GC3-rich and GC3-poor groups. By referring to out-group sequences from Arabidopsis and maize, we confirmed that the average synonymous substitution rate of the GC3-rich genes is significantly lower than that of the GC3-poor genes. Furthermore, we explored the other possible factors corresponding to the GC variation, including the length of coding sequences, the number of exons in each gene, the number of genes in each family, the location of genes on chromosomes, and the protein functions. Consequently, we propose that natural selection rather than mutation bias was the primary cause of the GC variation.
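The quantity this abstract is built on, GC content at third codon positions, is simple to compute. The sketch below assumes an in-frame coding sequence that starts at the first codon position and counts every complete codon, including a stop codon if one is present. The function name and the toy sequence are hypothetical.

```python
def gc3_content(cds: str) -> float:
    """Fraction of G or C bases at third codon positions.

    cds : in-frame coding sequence starting at codon position 1
          (an assumption of this sketch). Every third base from index 2
          is used; a trailing partial codon has no third base and is
          therefore ignored automatically.
    """
    third_positions = cds.upper()[2::3]
    if not third_positions:
        raise ValueError("sequence must contain at least one full codon")
    gc_count = sum(base in "GC" for base in third_positions)
    return gc_count / len(third_positions)


# Codons: ATG GCC AAG TAA -> third bases G, C, G, A
print(gc3_content("ATGGCCAAGTAA"))  # 0.75
```

Whether to drop the stop codon, and where to put the cutoff between rich and poor groups, are analysis choices the abstract does not specify, so they are left out here.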

20.
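Objective: To establish a quality control method for Bei Qinglongyi (北青龙衣). Methods: Thin-layer chromatography was used for identification, with chloroform-methanol (8:2) as the developing solvent. High-performance liquid chromatography was used for content determination, with acetonitrile-water-phosphoric acid (10:90:0.2) as the mobile phase and (4S)-4,5-dihydroxy-α-tetralone-4-O-β-D-glucopyranoside as the marker compound. Results: The method was sensitive, reliable, accurate, and reproducible. Conclusion: The method can be used for the quality control of Bei Qinglongyi.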
