Similar articles
1.
Benjamin Stich 《Genetics》2009,183(4):1525-1534
The nested association mapping (NAM) strategy promises to combine the advantages of linkage mapping and association mapping. The objectives of my research were to (i) investigate by computer simulations the power and type I error rate for detecting quantitative trait loci (QTL) with additive effects using recombinant inbred line (RIL) populations of maize derived from various mating designs, (ii) compare these estimates to those obtained for RIL populations of Arabidopsis thaliana, (iii) examine for both species the optimum number of inbreds used as parents of the NAM populations, and (iv) provide on the basis of the results of these two model species a general guideline for the design of NAM populations in other plant species. The computer simulations were based on empirical data of a set of 26 diverse maize inbred lines and a set of 20 A. thaliana inbreds both representing a large part of the genetic diversity of the corresponding species. I observed considerable differences in the power for QTL detection between NAM populations of the same size but created on the basis of different crossing schemes. This finding illustrated the potential to improve the power for QTL detection without increasing the total resources necessary for a QTL mapping experiment. Furthermore, my results clearly indicated that it is advantageous to create NAM populations from a large number of parental inbreds.

Many traits that are important for fitness and agricultural value of plants are quantitative traits. Such traits are affected by many genes, the environment, and interactions between genes and the environment (Holland 2007). In plants, quantitative trait locus (QTL) mapping is a key tool for studying the genetic architecture of quantitative traits (Yano 2001).
This method enables the estimation of (i) the number of genome regions affecting a trait, (ii) the distribution of gene effects, and (iii) the relative importance of additive and nonadditive gene action.

Until now, most of the plant QTL mapping studies have been based on linkage mapping methods using individual biparental populations. The major limitations of such approaches are a poor resolution in detecting QTL and that with biparental crosses of inbred lines only two alleles at any given locus can be studied simultaneously (Flint-Garcia et al. 2005). Association mapping methods, which are successfully applied in human genetics to detect genes coding for human diseases (e.g., Willer et al. 2008), promise to overcome these limitations (Kraakman et al. 2004). However, in comparison with linkage mapping approaches, association mapping approaches have only a low power to detect QTL in genomewide scans (Yu and Buckler 2006).

The nested association mapping (NAM) strategy proposed by Yu et al. (2008) uses recombinant inbred line (RIL) populations derived from several crosses of parental inbreds. Due to diminishing chances of recombination over short genetic distance and a given number of generations, the genomes of these RILs are mosaics of chromosomal segments of their parental genomes. Consequently, within the chromosomal segments, the linkage disequilibrium (LD) information across the parental inbreds is maintained. Thus, if diverse parental inbreds are used, LD decays within the chromosomal segments of the RILs over a short physical distance (Wilson et al. 2004). Therefore, the NAM strategy allows both recent and ancient recombination to be exploited and, thus, will show a high mapping resolution (Yu et al. 2008).
Furthermore, due to the balanced design underlying the proposed mapping strategy as well as the systematic reshuffling of the genomes of the parental inbreds during RIL development, NAM populations are expected to show a high power to detect QTL in genomewide approaches (Buckler et al. 2009).

Exploitation of the advantages of the NAM strategy requires developing, genotyping, and phenotyping of RIL populations from several crosses of diverse parental inbreds. This, however, requires large financial resources (cf. Yu et al. 2008). Therefore, it is mandatory that the available resources are spent in an optimum way.

Stich et al. (2009) examined the optimum allocation of resources for NAM in maize with respect to the number of RILs derived from the reference design as well as the number of environments and replications per environment used for phenotypic evaluation. The power for QTL detection, however, is expected to be influenced not only by these factors but also by the crossing scheme from which RIL populations are derived. To my knowledge, no study has so far compared RIL populations derived from various mating designs regarding the power for detecting QTL with additive effects. Furthermore, no information is available on the optimum number of inbreds used as parents of the NAM populations.

For Arabidopsis thaliana, more advanced genomic tools are available than for most other plant species (e.g., Alonso et al. 2003; Clark et al. 2007). This fact increases the prospects of success of NAM approaches. However, A. thaliana differs from maize with respect to the genome size and the allele frequency, which both have the potential to influence the power for QTL detection. Nevertheless, to my knowledge, no study has so far examined the power of NAM in A. thaliana.

The objectives of my research were to (i) investigate by computer simulations the power and type I error rate for detecting QTL with additive effects using RIL populations of maize derived from various mating designs, (ii) compare these estimates to those obtained for RIL populations of A. thaliana, (iii) examine for both species the optimum number of inbreds used as parents of the NAM populations, and (iv) provide on the basis of the results of these two model species a general guideline for the design of NAM populations in other plant species.

2.
Domesticates are an excellent model for understanding biological consequences of rapid climate change. Maize (Zea mays ssp. mays) was domesticated from a tropical grass yet is widespread across temperate regions today. We investigate the biological basis of temperate adaptation in diverse structured nested association mapping (NAM) populations from China, Europe (Dent and Flint) and the United States as well as in the Ames inbred diversity panel, using days to flowering as a proxy. Using cross-population prediction, where high prediction accuracy derives from overall genomic relatedness, shared genetic architecture, and sufficient diversity in the training population, we identify patterns in predictive ability across the five populations. To identify the source of temperate adapted alleles in these populations, we predict top associated genome-wide association study (GWAS) identified loci in a Random Forest Classifier using independent temperate–tropical North American populations based on lines selected from Hapmap3 as predictors. We find that North American populations are well predicted (AUC equals 0.89 and 0.85 for Ames and USNAM, respectively), European populations somewhat well predicted (AUC equals 0.59 and 0.67 for the Dent and Flint panels, respectively) and that the Chinese population is not predicted well at all (AUC is 0.47), suggesting an independent adaptation process for early flowering in China. Multiple adaptations for the complex trait days to flowering in maize provide hope for similar natural systems under climate change.

Subject terms: Evolutionary genetics, Quantitative trait

3.

Key message

Impacts of population structure on the evaluation of genomic heritability and prediction were investigated and quantified using high-density markers in diverse panels in rice and maize.

Abstract

Population structure is an important factor affecting estimation of genomic heritability and assessment of genomic prediction in stratified populations. In this study, our first objective was to assess effects of population structure on estimations of genomic heritability using the diversity panels in rice and maize. Results indicate population structure explained 33 and 7.5 % of genomic heritability for rice and maize, respectively, depending on traits, with the remaining heritability explained by within-subpopulation variation. Estimates of within-subpopulation heritability were higher than those derived from quantitative trait loci identified in genome-wide association studies, suggesting 65 % improvement in genetic gains. The second objective was to evaluate effects of population structure on genomic prediction using cross-validation experiments. When population structure exists in both training and validation sets, correcting for population structure led to a significant decrease in accuracy with genomic prediction. In contrast, when prediction was limited to a specific subpopulation, population structure showed little effect on accuracy and within-subpopulation genetic variance dominated predictions. Finally, effects of genomic heritability on genomic prediction were investigated. Accuracies with genomic prediction increased with genomic heritability in both training and validation sets, with the former showing a slightly greater impact. In summary, our results suggest that the population structure contribution to genomic prediction varies based on prediction strategies, and is also affected by the genetic architectures of traits and populations. In practical breeding, these conclusions may be helpful to better understand and utilize the different genetic resources in genomic prediction.

4.

Background

In mathematical epidemiology, age-structured epidemic models have usually been formulated as the boundary-value problems of the partial differential equations. On the other hand, in engineering, the backstepping method has recently been developed and widely studied by many authors.

Methods

Using the backstepping method, we obtained a boundary feedback control which plays the role of the threshold criteria for the prediction of increase or decrease of newly infected population. Under an assumption that the period of infectiousness is the same for all infected individuals (that is, the recovery rate is given by the Dirac delta function multiplied by a sufficiently large positive constant), the prediction method is simplified to the comparison of the numbers of reported cases at the current and previous time steps.

Results

Our prediction method was applied to the reported cases per sentinel of influenza in Japan from 2006 to 2015 and its accuracy was 0.81 (404 correct predictions to the total 500 predictions). It was higher than that of the ARIMA models with different orders of the autoregressive part, differencing and moving-average process. In addition, a proposed method for the estimation of the number of reported cases, which is consistent with our prediction method, was better than that of the best-fitted ARIMA model ARIMA(1,1,0) in the sense of mean square error.

Conclusions

Our prediction method based on the backstepping method can be simplified to the comparison of the numbers of reported cases of the current and previous time steps. In spite of its simplicity, it can provide a good prediction for the spread of influenza in Japan.
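The simplified rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not spell out the decision rule, so the sketch assumes that an increase is predicted for the next time step whenever the current count exceeds the previous one. The function names `predict_increase` and `direction_accuracy` and the sample series are hypothetical, not taken from the paper.

```python
from typing import List


def predict_increase(cases: List[float]) -> List[bool]:
    """Assumed rule: predict an increase at step t+1 if cases[t] > cases[t-1].

    Returns one prediction for each t in 1 .. len(cases) - 2, i.e. for every
    step that has both a previous value and a known next value.
    """
    return [cases[t] > cases[t - 1] for t in range(1, len(cases) - 1)]


def direction_accuracy(cases: List[float]) -> float:
    """Fraction of steps where the predicted direction matched what happened."""
    predicted = predict_increase(cases)
    observed = [cases[t + 1] > cases[t] for t in range(1, len(cases) - 1)]
    hits = sum(p == o for p, o in zip(predicted, observed))
    return hits / len(predicted)


# Hypothetical weekly reported cases per sentinel (not real surveillance data).
weekly_cases = [1.0, 2.0, 4.0, 7.0, 6.0, 3.0, 2.0]
print(direction_accuracy(weekly_cases))
```

Accuracy here is simply correct predictions divided by total predictions, which matches the way the abstract reports it (404 correct out of 500, about 0.81).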

5.

Background:

Several years after sequencing the human genome and the mouse genome, much remains to be discovered about the functions of most human and mouse genes. Computational prediction of gene function promises to help focus limited experimental resources on the most likely hypotheses. Several algorithms using diverse genomic data have been applied to this task in model organisms; however, the performance of such approaches in mammals has not yet been evaluated.

Results:

In this study, a standardized collection of mouse functional genomic data was assembled; nine bioinformatics teams used this data set to independently train classifiers and generate predictions of function, as defined by Gene Ontology (GO) terms, for 21,603 mouse genes; and the best performing submissions were combined in a single set of predictions. We identified strengths and weaknesses of current functional genomic data sets and compared the performance of function prediction algorithms. This analysis inferred functions for 76% of mouse genes, including 5,000 currently uncharacterized genes. At a recall rate of 20%, a unified set of predictions averaged 41% precision, with 26% of GO terms achieving a precision better than 90%.
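To make the "precision at a recall rate of 20%" figure concrete, here is a minimal Python sketch of how precision at a fixed recall level can be read off a ranked list of predictions. It is not the evaluation code used in the study; the function name `precision_at_recall` and the toy scores and labels are assumptions for illustration only.

```python
from typing import List


def precision_at_recall(scores: List[float], labels: List[bool],
                        target_recall: float) -> float:
    """Precision at the shortest ranked prefix whose recall reaches the target.

    scores: classifier confidence per gene (higher means more confident).
    labels: True if the gene really is annotated with the GO term.
    """
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("need at least one positive label")
    # Rank predictions from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_positives = 0
    for k, (_, is_positive) in enumerate(ranked, start=1):
        true_positives += bool(is_positive)
        if true_positives / total_positives >= target_recall:
            return true_positives / k
    raise ValueError("target_recall must be at most 1.0")


# Hypothetical scores for five genes and one GO term.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
toy_labels = [True, False, True, False, False]
print(precision_at_recall(toy_scores, toy_labels, 0.5))
```

Averaging this quantity over GO terms would give a summary comparable in spirit to the 41% figure quoted above, though the study's exact averaging scheme is not described in the abstract.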

Conclusion:

We performed a systematic evaluation of diverse, independently developed computational approaches for predicting gene function from heterogeneous data sources in mammals. The results show that currently available data for mammals allow predictions with both breadth and accuracy. Importantly, many highly novel predictions emerge for the 38% of mouse genes that remain uncharacterized.

6.
Nested Association Mapping for Identification of Functional Markers
Identification of functional markers (FMs) provides information about the genetic architecture underlying complex traits. An approach that combines the strengths of linkage and association mapping, referred to as nested association mapping (NAM), has been proposed to identify FMs in many plant species. The ability to identify and resolve FMs for complex traits depends upon a number of factors including frequency of FM alleles, magnitudes of their genetic effects, disequilibrium among functional and nonfunctional markers, statistical analysis methods, and mating design. The statistical characteristics of power, accuracy, and precision to identify FMs with a NAM population were investigated using three simulation studies. The simulated data sets utilized publicly available genetic sequences and simulated FMs were identified using least-squares variable selection methods. Results indicate that FMs with simple additive genetic effects that contribute at least 5% to the phenotypic variability in at least five segregating families of a NAM population consisting of recombinant inbred progeny derived from 28 matings with a single reference inbred will have adequate power to accurately and precisely identify FMs. This resolution and power are possible even for genetic architectures consisting of disequilibrium among multiple functional and nonfunctional markers in the same genomic region, although the resolution of FMs will deteriorate rapidly if more than two FMs are tightly linked within the same amplicon. Finally, nested mating designs involving several reference parents will have a greater likelihood of resolving FMs than single reference designs.

The primary purpose for identifying functional markers (FMs) associated with complex traits in plant species is to provide molecular genetic information underlying variability upon which both artificial and natural selection are based.
FMs are defined as polymorphic sites within genomes that causally affect phenotypic trait variability (Andersen and Lubberstedt 2003). This definition is a pragmatic recognition that phenotypic variability can be due to genomic variability located outside of open reading frames. Forward genetics approaches to associate naturally occurring structural genomic variants with phenotypic variability can be broadly categorized as (1) linkage mapping, also referred to as quantitative trait locus (QTL) mapping, (2) association genetic mapping, also known as linkage disequilibrium (LD) mapping, and (3) designs that combine linkage and LD mapping.

The third approach based on the concept of combining LD with QTL mapping is a natural extension of the multifamily QTL approach and has been referred to as joint linkage and linkage disequilibrium mapping (JLLDM) (Xiong and Jin 2000; Farnir et al. 2002; Wu et al. 2002; Perez-Enciso 2003; Jung et al. 2005) in samples from natural populations. The combined approach also has been applied to designed mapping families sampled from plant breeding populations (Xu 1998a; Jannink and Jansen 2000; Jannink and Wu 2003; Jansen et al. 2003). A special case of designed mapping families that are interconnected, known as nested association mapping (NAM), was proposed by Yu et al. (2008). As originally proposed, a NAM population consists of multiple families of recombinant inbred lines (RILs) derived from multiple inbred lines crossed to a single reference inbred line. Implicitly, genomic information is composed of high-density genotypes of parental inbred lines and low-density genotypes from segregating progeny.
If the segregating progeny are RILs or doubled haploid lines (DHLs), then the genomic information can be “immortalized” for associations with phenotypes obtained through long-term longitudinal studies (Nordborg and Weigel 2008).

A NAM population consisting of 25 families with 200 RILs for each family has been developed and released as a genetic resource for identification of FMs in maize (Yu et al. 2008). Other publicly available NAM populations are being developed for several species including Arabidopsis thaliana (Buckler and Gore 2007), barley (R. Wise, personal communication), sorghum (J. Yu, personal communication), and soybean (B. Diers, personal communication).

The power, accuracy, and precision of identifying FMs in experimental NAM populations have not been investigated for complex genetic architectures. These statistical properties depend upon a number of factors including the following:
  1. Data analysis method: Some methods are more powerful than others; however, experimental biologists prefer methods implemented in existing software packages. Are least-squares methods sufficiently powerful to identify FMs in established and developing NAM populations?
  2. Frequency of functional markers and magnitudes of genetic effects: Development of a NAM population will change the allele frequencies of the FM relative to the reference population from which the lines are sampled. How will allele frequency and magnitude of genetic effects in a typical NAM population affect the ability to identify FMs?
  3. Disequilibrium among functional and nonfunctional markers: Disequilibrium may exist among alleles within subpopulations even when there is no physical basis for genetic linkage. To what extent can the NAM design address consequences of gametic disequilibrium (population structure) in the reference population?
  4. Multiple FMs in the same genomic region: If multiple FMs are physically located in the same genomic region, will equilibrium among the parental lines enable resolution of multiple FMs?
  5. Mating design: An appropriate mating design can maximize the number of families that are informative for FMs. Will multiple-reference mating designs improve the probability of identifying FMs?
These five questions were addressed.

7.

Key message

Using NIR and NMR predictions of quality traits overcomes a major barrier for the application of genomic selection to accelerate improvement in grain end-use quality traits of wheat.

Abstract

Grain end-use quality traits are among the most important in wheat breeding. These traits are difficult to breed for, as their assays require flour quantities only obtainable late in the breeding cycle, and are expensive. These traits are therefore an ideal target for genomic selection. However, large reference populations are required for accurate genomic predictions, which are challenging to assemble for these traits for the same reasons they are challenging to breed for. Here, we use predictions of end-use quality derived from near infrared (NIR) or nuclear magnetic resonance (NMR), that require very small amounts of flour, as well as end-use quality measured by industry standard assay in a subset of accessions, in a multi-trait approach for genomic prediction. The NIR and NMR predictions were derived for 19 end-use quality traits in 398 accessions, and were then assayed in 2420 diverse wheat accessions. The accessions were grown out in multiple locations and multiple years, and were genotyped for 51208 SNP. Incorporating NIR and NMR phenotypes in the multi-trait approach increased the accuracy of genomic prediction for most quality traits. The accuracy ranged from 0 to 0.47 before the addition of the NIR/NMR data, while after these data were added, it ranged from 0 to 0.69. Genomic predictions were reasonably robust across locations and years for most traits. Using NIR and NMR predictions of quality traits overcomes a major barrier for the application of genomic selection for grain end-use quality traits in wheat breeding.

8.

Background

Models that are capable of reliably predicting binding affinities for protein-ligand complexes play an important role in the field of structure-guided drug design.

Methods

Here, we begin by applying the computational geometry technique of Delaunay tessellation to each set of atomic coordinates for over 1400 diverse macromolecular structures, for the purpose of deriving a four-body statistical potential that serves as a topological scoring function. Next, we identify a second, independent set of three hundred protein-ligand complexes, having both high-resolution structures and known dissociation constants. Two-thirds of these complexes are randomly selected to train a predictive model of binding affinity as follows: two tessellations are generated in each case, one for the entire complex and another strictly for the isolated protein without its bound ligand, and a topological score is computed for each tessellation with the four-body potential. Predicted protein-ligand binding affinity is then based on an empirically derived linear function of the difference between both topological scores, one that appropriately scales the value of this difference.
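The last step of the method, a linear function of the difference between the two topological scores, can be illustrated with an ordinary least-squares line fit. The sketch below is an assumption-laden stand-in: the abstract says only that the linear function is "empirically derived", so least squares is our choice, and the names `fit_line`, `pearson_r`, and `predict_affinity` as well as all numbers are hypothetical. Computing the four-body potential and the Delaunay tessellations themselves is outside the scope of this example.

```python
from math import sqrt
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept for y ~ slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson_r(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sqrt(sxx * syy)


def predict_affinity(score_complex: float, score_protein: float,
                     slope: float, intercept: float) -> float:
    """Scale the topological score difference into a binding affinity."""
    return slope * (score_complex - score_protein) + intercept


# Hypothetical training data: score differences and measured affinities.
score_diffs = [0.0, 1.0, 2.0, 3.0]
affinities = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(score_diffs, affinities)
print(predict_affinity(5.0, 3.0, slope, intercept))
```

Fitting on a training subset and then reporting `pearson_r` on held-out complexes mirrors the train and validation split described in the abstract.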

Results

A comparison between experimental and calculated binding affinity values over the two hundred complexes reveals a Pearson's correlation coefficient of r = 0.79 with a standard error of SE = 1.98 kcal/mol. To validate the method, we similarly generated two tessellations for each of the remaining protein-ligand complexes, computed their topological scores and the difference between the two scores for each complex, and applied the previously derived linear transformation of this topological score difference to predict binding affinities. For these one hundred complexes, we again observe a correlation of r = 0.79 (SE = 1.93 kcal/mol) between known and calculated binding affinities. Applying our model to an independent test set of high-resolution structures for three hundred diverse enzyme-inhibitor complexes, each with an experimentally known inhibition constant, also yields a correlation of r = 0.79 (SE = 2.39 kcal/mol) between experimental and calculated binding energies.

Conclusions

Lastly, we generate predictions with our model on a diverse test set of one hundred protein-ligand complexes previously used to benchmark 15 related methods, and our correlation of r = 0.66 between the calculated and experimental binding energies for this dataset exceeds those of the other approaches. Compared with these related prediction methods, our approach stands out based on salient features that include the reliability of our model, combined with the rapidity of the generated predictions, which are less than one second for an average sized complex.

9.

Background

Genomic prediction is becoming a daily tool for plant breeders. It makes use of genotypic information to make predictions used for selection decisions. The accuracy of the predictions depends on the number of genotypes used in the calibration; hence, there is a need to combine data across years. A proper phenotypic analysis is a crucial prerequisite for accurate calibration of genomic prediction procedures. We compared stage-wise approaches to analyse a real dataset of a multi-environment trial (MET) in rye, which was connected between years only through one check, and used different spatial models to obtain better estimates, and thus, improved predictive abilities for genomic prediction. The aims of this study were to assess the advantage of using spatial models for the predictive abilities of genomic prediction, to identify suitable procedures to analyse a MET weakly connected across years using different stage-wise approaches, and to explore genomic prediction as a tool for selection of models for phenotypic data analysis.

Results

Using complex spatial models did not significantly improve the predictive ability of genomic prediction, but using row and column effects yielded the highest predictive abilities of all models. In the case of MET poorly connected between years, analysing each year separately and fitting year as a fixed effect in the genomic prediction stage yielded the most realistic predictive abilities. Predictive abilities can also be used to select models for phenotypic data analysis. The trend of the predictive abilities was not the same as the traditionally used Akaike information criterion, but favoured in the end the same models.

Conclusions

Making predictions using weakly linked datasets is of utmost interest for plant breeders. We provide an example with suggestions on how to handle such cases. Rather than relying on checks we show how to use year means across all entries for integrating data across years. It is further shown that fitting of row and column effects captures most of the heterogeneity in the field trials analysed.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-646) contains supplementary material, which is available to authorized users.

10.

Key message

We suggest multi-parental nested association mapping as a valuable innovation in barley genetics, which increases the power to map quantitative trait loci and assists in extending genetic diversity of the elite barley gene pool.

Abstract

Plant genetic resources are a key asset to further improve crop species. The nested association mapping (NAM) approach was introduced to identify favorable genes in multi-parental populations. Here, we report on the development of the first explorative barley NAM population and demonstrate its usefulness in a study on mapping quantitative trait loci (QTLs) for leaf rust resistance. The NAM population HEB-5 was developed from crossing and backcrossing five exotic barley donors with the elite barley cultivar ‘Barke,’ resulting in 295 NAM lines in generation BC1S1. HEB-5 was genetically characterized with 1,536 barley SNPs. Across HEB-5 and within the NAM families, no deviation from the expected genotype and allele frequencies was detected. Genetic similarity between ‘Barke’ and the NAM families ranged from 78.6 to 83.1 %, confirming the backcrossing step during population development. To explore its usefulness, a screen for leaf rust (Puccinia hordei) seedling resistance was conducted. Resistance QTLs were mapped to six barley chromosomes, applying a mixed model genome-wide association study. In total, four leaf rust QTLs were detected across HEB-5 and four QTLs within family HEB-F23. Favorable exotic QTL alleles reduced leaf rust symptoms on two chromosomes by 33.3 and 36.2 %, respectively. The located QTLs may represent new resistance loci or correspond to new alleles of known resistance genes. We conclude that the exploratory population HEB-5 can be applied to mapping and utilizing exotic QTL alleles of agronomic importance. The NAM concept will foster the evaluation of the genetic diversity, which is present in our primary barley gene pool.

11.

Objective

To develop a new strategy combining near-infrared (NIR) and dielectric spectroscopies for real-time monitoring and in-depth characterization of populations of Chinese hamster ovary cells throughout cultures performed in bioreactors.

Results

Spectral data processing was based on off-line analyses of the cells, including trypan blue exclusion method, and lactate dehydrogenase activity (LDH). Viable cell density showed a linear correlation with permittivity up to 6 × 10⁶ cells ml⁻¹, while a logarithmic correlation was found between non-lysed dead cell density and conductivity up to 10⁷ cells ml⁻¹. Additionally, partial least square technique was used to develop a calibration model of the supernatant LDH activity based on online NIR spectra with a RMSEC of 55 U l⁻¹. Considering the LDH content of viable cells measured to be 110 U per 10⁹ cells, the lysed dead cell density could be then estimated. These calibration models provided real-time prediction accuracy (R² ≥ 0.95) for the three types of cell populations.
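The estimate of lysed dead cell density from supernatant LDH activity amounts to a unit conversion, sketched below in Python. Two assumptions are made and should be checked against the full paper: that the LDH content figure reads 110 U per 10^9 cells (the exponent is garbled in this listing), and that all LDH measured in the supernatant was released by lysed cells. The function name `lysed_cell_density_per_ml` is hypothetical.

```python
# Assumed LDH content per cell: 110 U per 1e9 cells (exponent inferred, see text).
LDH_UNITS_PER_CELL = 110.0 / 1e9


def lysed_cell_density_per_ml(ldh_activity_u_per_l: float) -> float:
    """Convert supernatant LDH activity (U per litre) to lysed cells per ml.

    Assumes every unit of supernatant LDH came from a lysed cell that
    carried the same LDH content as a viable cell.
    """
    cells_per_litre = ldh_activity_u_per_l / LDH_UNITS_PER_CELL
    return cells_per_litre / 1000.0  # 1 litre = 1000 ml


# A supernatant activity of 110 U per litre corresponds to about 1e6 cells per ml.
print(lysed_cell_density_per_ml(110.0))
```

Under these assumptions, the reported RMSEC of 55 U per litre translates to an uncertainty of roughly 5 × 10^5 lysed cells per ml.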

Conclusion

The high potential of a dual spectroscopy strategy to enhance the online bioprocesses characterization is demonstrated since it allows the simultaneous determination of viable, dead and lysed cell populations in real time.

12.

Key message

A repertoire of the genomic regions involved in quantitative resistance to Leptosphaeria maculans in winter oilseed rape was established from combined linkage-based QTL and genome-wide association (GWA) mapping.

Abstract

Linkage-based mapping of quantitative trait loci (QTL) and genome-wide association studies are complementary approaches for deciphering the genomic architecture of complex agronomical traits. In oilseed rape, quantitative resistance to blackleg disease, caused by L. maculans, is highly polygenic and is greatly influenced by the environment. In this study, we took advantage of multi-year data available on three segregating populations derived from the resistant cv Darmor and multi-year data available on oilseed rape panels to obtain a wide overview of the genomic regions involved in quantitative resistance to this pathogen in oilseed rape. Sixteen QTL regions were common to at least two biparental populations, of which nine were the same as previously detected regions in a multi-parental design derived from different resistant parents. Eight regions were significantly associated with quantitative resistance, of which five on A06, A08, A09, C01 and C04 were located within QTL support intervals. Homoeologous Brassica napus genes were found in eight homoeologous QTL regions, which corresponded to 657 pairs of homoeologous genes. Potential candidate genes underlying this quantitative resistance were identified. Genomic predictions and breeding are also discussed, taking into account the highly polygenic nature of this resistance.

13.
An emerging insight in invasion biology is that intra-specific genetic variation, human usage, and introduction histories interact to shape genetic diversity and its distribution in populations of invasive species. We explore these aspects for the tree species Paraserianthes lophantha subsp. lophantha, a close relative of Australian wattles (genus Acacia). This species is native to Western Australia and is invasive in a number of regions globally. Using microsatellite genotype and DNA sequencing data, we show that native Western Australian populations of P. lophantha subsp. lophantha are geographically structured and are more diverse than introduced populations in Australia (New South Wales, South Australia, and Victoria), the Hawaiian Islands, Portugal, and South Africa. Introduced populations varied greatly in the amount of genetic diversity contained within them, from being low (e.g. Portugal) to high (e.g. Maui, Hawaiian Islands). Irrespective of provenance (native or introduced), all populations appeared to be highly inbred (F_IS ranging from 0.55 to 0.8), probably due to selfing. Although introduced populations generally had lower genetic diversity than native populations, Bayesian clustering of microsatellites and phylogenetic diversity indicated that introduced populations comprise a diverse array of genotypes, most of which were also identified in Western Australia. The dissimilarity in the distribution and number of genotypes in introduced regions suggests that non-native populations originated from different native sources and that introduction events differed in propagule pressure.

14.

Key message

Genomic selection shows great promise for pre-selecting lines with superior bread baking quality in early generations, 3 years ahead of labour-intensive, time-consuming, and costly quality analysis.

Abstract

The genetic improvement of baking quality is one of the grand challenges in wheat breeding as the assessment of the associated traits often involves time-consuming, labour-intensive, and costly testing forcing breeders to postpone sophisticated quality tests to the very last phases of variety development. The prospect of genomic selection for complex traits like grain yield has been shown in numerous studies, and might thus be also an interesting method to select for baking quality traits. Hence, we focused in this study on the accuracy of genomic selection for laborious and expensive to phenotype quality traits as well as its selection response in comparison with phenotypic selection. More than 400 genotyped wheat lines were, therefore, phenotyped for protein content, dough viscoelastic and mixing properties related to baking quality in multi-environment trials 2009–2016. The average prediction accuracy across three independent validation populations was r = 0.39 and could be increased to r = 0.47 by modelling major QTL as fixed effects as well as employing multi-trait prediction models, which resulted in an acceptable prediction accuracy for all dough rheological traits (r = 0.38–0.63). Genomic selection can furthermore be applied 2–3 years earlier than direct phenotypic selection, and the estimated selection response was nearly twice as high in comparison with indirect selection by protein content for baking quality related traits. This considerable advantage of genomic selection could accordingly support breeders in their selection decisions and aid in efficiently combining superior baking quality with grain yield in newly developed wheat varieties.
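As an illustration of how an average prediction accuracy across validation populations might be computed, the following sketch assumes that accuracy is the Pearson correlation between genomic predictions and observed phenotypes within each validation set, a common convention that the abstract does not spell out. The function names and any data passed to them are hypothetical.

```python
import math


def pearson_r(predicted, observed):
    """Pearson correlation between predicted and observed values."""
    n = len(predicted)
    if n != len(observed) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    s_po = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    if s_pp == 0 or s_oo == 0:
        raise ValueError("correlation is undefined for constant input")
    return s_po / math.sqrt(s_pp * s_oo)


def mean_accuracy(validation_sets):
    """Average of per-population correlations.

    validation_sets is a list of (predicted, observed) pairs,
    one pair per independent validation population.
    """
    if not validation_sets:
        raise ValueError("at least one validation set is required")
    return sum(pearson_r(p, o) for p, o in validation_sets) / len(validation_sets)
```

Averaging per-population correlations, rather than pooling all lines into one correlation, keeps differences in population means from inflating the estimate; whether the study did this is not stated in the abstract.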

15.
The efficiency of marker-assisted prediction of phenotypes has been studied intensively for different types of plant breeding populations. However, one remaining question is how to incorporate and counterbalance information from biparental and multiparental populations into model training for genome-wide prediction. To address this question, we evaluated testcross performance of 1652 doubled-haploid maize (Zea mays L.) lines that were genotyped with 56,110 single nucleotide polymorphism markers and phenotyped for five agronomic traits in four to six European environments. The lines are arranged in two diverse half-sib panels representing two major European heterotic germplasm pools. The data set contains 10 related biparental dent families and 11 related biparental flint families generated from crosses of maize lines important for European maize breeding. With this new data set we analyzed genome-based best linear unbiased prediction in different validation schemes and compositions of estimation and test sets. Further, we theoretically and empirically investigated marker linkage phases across multiparental populations. In general, predictive abilities similar to or higher than those within biparental families could be achieved by combining several half-sib families in the estimation set. For the majority of families, 375 half-sib lines in the estimation set were sufficient to reach the same predictive performance of biomass yield as an estimation set of 50 full-sib lines. In contrast, prediction across heterotic pools was not possible for most cases. Our findings are important for experimental design in genome-based prediction as they provide guidelines for the genetic structure and required sample size of data sets used for model training.

16.
17.

Key message

The RTM-GWAS was chosen among five procedures to identify DTF QTL-allele constitution in a soybean NAM population; 139 QTLs with 496 alleles accounting for 81.7% of phenotypic variance were detected.

Abstract

Flowering date (days to flowering, DTF) is an ecological trait in soybean, closely related to its ability to adapt to different growing areas. A nested association mapping (NAM) population consisting of four RIL populations (LM, ZM, MT and MW with M8206 as their common parent) was established and tested for DTF under five environments. The population was genotyped with SNP markers using restriction-site-associated DNA sequencing. The restricted two-stage multi-locus (RTM) genome-wide association study (GWAS) (RTM-GWAS), with SNP linkage disequilibrium blocks (SNPLDB) as multi-allele genomic markers, performed the best among the five mapping procedures with publicly available software. It identified the greatest number of quantitative trait loci (QTLs) (139) and alleles (496) on 20 chromosomes, covering almost all of the QTLs detected by the four other mapping procedures. The RTM-GWAS provided the detected QTLs with the highest genetic contribution but without overflowing and missing heritability problems (81.7% genetic contribution vs. heritability of 97.6%), while SNPLDB markers matched the NAM population property of multiple alleles per locus. The 139 QTLs with 496 alleles were organized into a QTL-allele matrix, showing the corresponding DTF genetic architecture of the five parents and the NAM population. All lines and parents comprised both positive and negative alleles, implying a great potential of recombination for early and late DTF improvement. From the detected QTL-allele system, 126 candidate genes were annotated and χ²-tested as a DTF candidate gene system involving nine biological processes, indicating that the trait is complex, involving several biological processes rather than only a handful of major genes.

18.

Background

Protein residue-residue contact prediction is important for protein model generation and model evaluation. Here we develop a conformation ensemble approach to improve residue-residue contact prediction. We collect a number of structural models stemming from a variety of methods and implementations. The various models capture slightly different conformations and contain complementary information which can be pooled together to capture recurrent, and therefore more likely, residue-residue contacts.

Results

We applied our conformation ensemble approach to free modeling targets from both CASP8 and CASP9. Given a diverse ensemble of models, the method is able to achieve accuracies of 0.48 for the top L/5 medium range contacts and 0.36 for the top L/5 long range contacts for CASP8 targets (L being the target domain length). When applied to targets from CASP9, the accuracies of the top L/5 medium and long range contact predictions were 0.34 and 0.30, respectively.

Conclusions

When operating on a moderately diverse ensemble of models, the conformation ensemble approach is an effective means to identify medium and long range residue-residue contacts. An immediate benefit of the method is that when tied with a scoring scheme, it can be used to successfully rank models.
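The idea of pooling an ensemble of models to find recurrent residue-residue contacts can be sketched as a simple voting scheme. This is an illustrative sketch only, not the authors' implementation: it assumes each model has already been reduced to a set of (i, j) residue index pairs in contact, it ranks contacts by the fraction of models that contain them, and it uses sequence-separation bins of 12 to 23 residues for medium range and 24 or more for long range, which is a convention assumed here rather than taken from the abstract. All function names are hypothetical.

```python
from collections import Counter


def sequence_range(i, j):
    """Classify a contact by sequence separation (assumed bin boundaries)."""
    sep = abs(i - j)
    if sep >= 24:
        return "long"
    if sep >= 12:
        return "medium"
    if sep >= 6:
        return "short"
    return None


def consensus_contacts(model_contacts, length, range_name, fraction=5):
    """Return the top L/fraction contacts of one range, ranked by model support.

    model_contacts: list with one iterable of (i, j) pairs per structural model.
    length: target domain length L.
    """
    counts = Counter()
    for contacts in model_contacts:
        # Treat (i, j) and (j, i) as the same contact and count it once per model.
        unique = {(min(i, j), max(i, j)) for i, j in contacts}
        counts.update(c for c in unique if sequence_range(*c) == range_name)
    n_models = len(model_contacts)
    # Sort by descending support; break ties by residue indices for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    top = ranked[: max(1, length // fraction)]
    return [(pair, count / n_models) for pair, count in top]


def accuracy(predicted, native_contacts):
    """Fraction of predicted contacts that are present in the native structure."""
    native = {(min(i, j), max(i, j)) for i, j in native_contacts}
    if not predicted:
        return 0.0
    hits = sum(1 for pair, _ in predicted if pair in native)
    return hits / len(predicted)
```

Ranking by support fraction means a contact seen in every model outranks one seen in only a few, which is the sense in which recurrent contacts are treated as more likely.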

19.

Background

In contrast to currently used single nucleotide polymorphism (SNP) panels, the use of whole-genome sequence data is expected to enable the direct estimation of the effects of causal mutations on a given trait. This could lead to higher reliabilities of genomic predictions compared to those based on SNP genotypes. Also, at each generation of selection, recombination events between a SNP and a mutation can cause decay in reliability of genomic predictions based on markers rather than on the causal variants. Our objective was to investigate the use of imputed whole-genome sequence genotypes versus high-density SNP genotypes on (the persistency of) the reliability of genomic predictions using real cattle data.

Methods

Highly accurate phenotypes based on daughter performance and Illumina BovineHD Beadchip genotypes were available for 5503 Holstein Friesian bulls. The BovineHD genotypes (631,428 SNPs) of each bull were used to impute whole-genome sequence genotypes (12,590,056 SNPs) using the Beagle software. Imputation was done using a multi-breed reference panel of 429 sequenced individuals. Genomic estimated breeding values for three traits were predicted using a Bayesian stochastic search variable selection (BSSVS) model and a genome-enabled best linear unbiased prediction model (GBLUP). Reliabilities of predictions were based on 2087 validation bulls, while the other 3416 bulls were used for training.

Results

Prediction reliabilities ranged from 0.37 to 0.52. BSSVS performed better than GBLUP in all cases. Reliabilities of genomic predictions were slightly lower with imputed sequence data than with BovineHD chip data. Also, the reliabilities tended to be lower for both sequence data and BovineHD chip data when relationships between training animals were low. No increase in persistency of prediction reliability using imputed sequence data was observed.

Conclusions

Compared to BovineHD genotype data, using imputed sequence data for genomic prediction produced no advantage. To investigate the putative advantage of genomic prediction using (imputed) sequence data, a training set with a larger number of individuals that are distantly related to each other and genomic prediction models that incorporate biological information on the SNPs or that apply stricter SNP pre-selection should be considered.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-015-0149-x) contains supplementary material, which is available to authorized users.

20.
Factors promoting the invasion success of introduced populations have been receiving increased attention in studies of biological invasions. Previous reports have indicated that successful invasions may be attributable to reduced genetic diversity in the invasive species. However, there is large variation in the magnitude and direction of the impact of exotic species that has remained unexplained. Here, we present a structured meta-analysis of papers investigating the genetic diversity of native and introduced populations of exotic insects using nuclear microsatellites and mitochondrial DNA sequences. The results indicate that invasion by exotic insects had an overall reducing effect on the genetic diversity of the invading population, with nonzero effect sizes for the number of alleles (NA), observed heterozygosity (Ho), expected heterozygosity (He) and nucleotide diversity (Nd). However, when analyzing different orders (e.g., Lepidoptera, Hemiptera), the effect sizes of NA, Ho and Nd in Lepidoptera were found to bracket zero, as did the effect size of He in Hemiptera. These results suggest an asymmetric reduction in the genetic diversity of introduced populations of exotic insects, indicating diverse mechanisms underlying their successful invasion.
