Similar literature (20 records found)
1.

Background

Genomic prediction requires estimation of variances of effects of single nucleotide polymorphisms (SNPs), which is computationally demanding, and uses these variances for prediction. We have developed models with separate estimation of SNP variances, which can be applied infrequently, and genomic prediction, which can be applied routinely.

Methods

SNP variances were estimated with Bayes Stochastic Search Variable Selection (BSSVS) and BayesC. Genome-enhanced breeding values (GEBV) were estimated with RR-BLUP (ridge regression best linear unbiased prediction), using either variances obtained from BSSVS (BLUP-SSVS) or BayesC (BLUP-C), or assuming equal variances for each SNP. Datasets used to estimate SNP variances comprised (1) all animals, (2) 50% random animals (RAN50), (3) 50% best animals (TOP50), or (4) 50% worst animals (BOT50). Traits analysed were protein yield, udder depth, somatic cell score, interval between first and last insemination, direct longevity, and longevity including information from predictors.

Results

BLUP-SSVS and BLUP-C yielded GEBV similar to those from the equivalent Bayesian models that simultaneously estimated SNP variances. Reliabilities of these GEBV were consistently higher than from RR-BLUP, although only significantly for direct longevity. Across scenarios that used data subsets to estimate GEBV, observed reliabilities were generally higher for TOP50 than for RAN50, and much higher than for BOT50. Reliabilities of TOP50 were higher because the training data contained more ancestors of selection candidates. Using estimated SNP variances based on random or non-random subsets of the data, while using all data to estimate GEBV, did not affect reliabilities of the BLUP models. A convergence criterion of 10^-8 instead of 10^-10 for BLUP models yielded similar GEBV, while the required number of iterations decreased by 71 to 90%. Including a separate polygenic effect consistently improved reliabilities of the GEBV, but also substantially increased the required number of iterations to reach convergence with RR-BLUP. SNP variances converged faster for BayesC than for BSSVS.
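The two-step idea (fixed SNP variances, then an iterative BLUP solve with a convergence criterion) can be illustrated with a minimal sketch. It assumes a Gauss-Seidel solver with residual updating and a convergence criterion defined as the sum of squared changes divided by the sum of squared solutions; the abstract states neither the solver nor the exact criterion, and the function name `rr_blup_gauss_seidel` and its arguments are hypothetical.

```python
def rr_blup_gauss_seidel(Z, y, snp_var, resid_var, tol=1e-8, max_iter=10000):
    """Solve y = mu + Z*alpha + e with SNP-specific shrinkage.

    Z        : list of rows, genotype codes per animal (n x m)
    snp_var  : per-SNP variances, estimated beforehand (assumed given)
    resid_var: residual variance
    Returns (mu, alpha, iterations_used).
    """
    n, m = len(Z), len(Z[0])
    # Shrinkage added to the diagonal: residual variance / SNP variance.
    lam = [resid_var / v for v in snp_var]
    zz = [sum(Z[i][j] ** 2 for i in range(n)) for j in range(m)]
    mu = 0.0
    alpha = [0.0] * m
    e = list(y)  # current residuals y - mu - Z*alpha
    for iteration in range(1, max_iter + 1):
        change = 0.0
        size = 0.0
        # Update the overall mean.
        new_mu = mu + sum(e) / n
        d = new_mu - mu
        e = [ei - d for ei in e]
        change += d * d
        size += new_mu * new_mu
        mu = new_mu
        # Update each SNP effect in turn, correcting residuals as we go.
        for j in range(m):
            rhs = sum(Z[i][j] * e[i] for i in range(n)) + zz[j] * alpha[j]
            new_a = rhs / (zz[j] + lam[j])
            d = new_a - alpha[j]
            for i in range(n):
                e[i] -= Z[i][j] * d
            change += d * d
            size += new_a * new_a
            alpha[j] = new_a
        # Assumed criterion: relative squared change of the solutions.
        if size > 0 and change / size < tol:
            return mu, alpha, iteration
    return mu, alpha, max_iter
```

Setting all entries of `snp_var` to the same value gives the equal-variance RR-BLUP case; calling the function with `tol=1e-8` and `tol=1e-10` on the same data shows how a looser criterion trades iterations for precision.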

Conclusions

Combining Bayesian variable selection models to re-estimate SNP variances with BLUP models that use those SNP variances yields GEBV that are similar to those from full Bayesian models. Moreover, these combined models yield predictions with higher reliability and less bias than the commonly used RR-BLUP model.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-014-0052-x) contains supplementary material, which is available to authorized users.

2.
3.
Accuracy of genomic breeding values in multi-breed dairy cattle populations

Background

Two key findings from genomic selection experiments are 1) the reference population used must be very large to subsequently predict accurate genomic estimated breeding values (GEBV), and 2) prediction equations derived in one breed do not predict accurate GEBV when applied to other breeds. Both findings are a problem for breeds where the number of individuals in the reference population is limited. A multi-breed reference population is a potential solution, and here we investigate the accuracies of GEBV in Holstein dairy cattle and Jersey dairy cattle when the reference population is single breed or multi-breed. The accuracies were obtained both as a function of elements of the inverse coefficient matrix and from the realised accuracies of GEBV.

Methods

Best linear unbiased prediction with a multi-breed genomic relationship matrix (GBLUP) and two Bayesian methods (BAYESA and BAYES_SSVS) which estimate individual SNP effects were used to predict GEBV for 400 and 77 young Holstein and Jersey bulls respectively, from a reference population of 781 and 287 Holstein and Jersey bulls, respectively. Genotypes of 39,048 SNP markers were used. Phenotypes in the reference population were de-regressed breeding values for production traits. For the GBLUP method, expected accuracies calculated from the diagonal of the inverse of coefficient matrix were compared to realised accuracies.

Results

When GBLUP was used, expected accuracies from a function of elements of the inverse coefficient matrix agreed reasonably well with realised accuracies calculated from the correlation between GEBV and EBV in single breed populations, but not in multi-breed populations. When the Bayesian methods were used, realised accuracies of GEBV were up to 13% higher when the multi-breed reference population was used than when a pure breed reference was used. However, no consistent increase in accuracy across traits was obtained.
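For orientation, expected and realised accuracies of this kind are commonly written as follows. This is a sketch under stated assumptions, not the authors' own notation: the mixed model equations are assumed to be scaled by the residual variance \sigma_e^2, C^{ii} is the diagonal element of the inverse coefficient matrix for animal i, G_{ii} is that animal's diagonal element in the genomic relationship matrix, and \sigma_g^2 is the genetic variance.

```latex
\mathrm{PEV}_i = C^{ii}\,\sigma_e^2, \qquad
r_i^{\text{expected}} = \sqrt{1 - \frac{\mathrm{PEV}_i}{G_{ii}\,\sigma_g^2}}, \qquad
r^{\text{realised}} = \operatorname{corr}\left(\mathrm{GEBV}, \mathrm{EBV}\right)
```

The expected accuracy is available per animal without validation data, whereas the realised accuracy requires a validation group with known EBV.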

Conclusion

Predicting genomic breeding values using a genomic relationship matrix is an attractive approach to implement genomic selection, as expected accuracies of GEBV can be readily derived. However, in multi-breed populations, Bayesian approaches give higher accuracies for some traits. Finally, multi-breed reference populations will be a valuable resource to fine map QTL.

4.
A sampling-based method for estimating the accuracy of estimated breeding values using an animal model is presented. Empirical variances of true and estimated breeding values were estimated from a simulated n-sample. The method was validated using a small data set from the Parthenaise breed with the estimated coefficient of determination converging to the true values. It was applied to the French Salers data file used for the 2000 on-farm evaluation (IBOVAL) of muscle development score. A drawback of the method is its computational demand. Consequently, convergence cannot be achieved in a reasonable time for very large data files. Two advantages of the method are that a) it is applicable to any model (animal, sire, multivariate, maternal effects...) and b) it supplies off-diagonal coefficients of the inverse of the mixed model equations and can therefore be the basis of connectedness studies.

5.
High-throughput SNP genotyping by single-tube PCR with Tm-shift primers
Despite many recent advances in high-throughput single nucleotide polymorphism (SNP) genotyping technologies, there is still a great need for inexpensive and flexible methods with a reasonable throughput. Here we report substantial modifications and improvements to an existing homogeneous allele-specific PCR-based SNP genotyping method, making it an attractive new option for researchers engaging in candidate gene studies or following up on genome-wide scans. In this advanced version of the melting temperature (Tm)-shift SNP genotyping method, we attach two GC-rich tails of different lengths to allele-specific PCR primers, such that SNP alleles in genomic DNA samples can be discriminated by the Tms of the PCR products. We have validated 306 SNP assays using this method and achieved a success rate in assay development of greater than 83% under uniform PCR conditions. We have developed a standalone software application to automatically assign genotypes directly from melting curve data. To demonstrate the accuracy of this method, we typed 592 individuals for 6 SNPs and showed a high call rate (>98%) and high accuracy (>99.9%). With this method, 6-10,000 samples can be genotyped per day using a single 384-well real-time thermal cycler with 2-4 standard 384-well PCR instruments.

6.
Lstiburek M, Mullin TJ, Mackay TF, Huber D, Li B. Genetics, 2005, 171(3): 1311-1320
While other investigations have described benefits of positive assortative mating (PAM) for forest tree breeding, the allocation of resources among mates in these studies was either equal or varied, using schemes corresponding only to parental rank (i.e., more resources invested in higher-ranking parents). In this simulation study, family sizes were proportional to predicted midparent BLUP values. The distribution of midparent BLUP values was standardized by a constant, which was varied to study the range of distributions of family size. Redistributing progenies from lower- to higher-ranking families to a point where an equal number of progenies were still selected out of each family to the next generation caused minimal change in group coancestry and inbreeding in the breeding population (BP), while the additive genetic response and variance in the BP were both greatly enhanced. This generated additional genetic gains for forest plantations by selecting more superior genotypes from the BP (compared to PAM with equal family sizes) for production of improved regeneration materials. These conclusions were verified for a range of heritability under a polygenic model and under a mixed-inheritance model with a QTL contributing to the trait variation.

7.
We use high-density single nucleotide polymorphism (SNP) genotyping microarrays to demonstrate the ability to accurately and robustly determine whether individuals are in a complex genomic DNA mixture. We first develop a theoretical framework for detecting an individual's presence within a mixture, then show, through simulations, the limits associated with our method, and finally demonstrate experimentally the identification of the presence of genomic DNA of specific individuals within a series of highly complex genomic mixtures, including mixtures where an individual contributes less than 0.1% of the total genomic DNA. These findings shift the perceived utility of SNPs for identifying individual trace contributors within a forensics mixture, and suggest future research efforts into assessing the viability of previously sub-optimal DNA sources due to sample contamination. These findings also suggest that composite statistics across cohorts, such as allele frequency or genotype counts, do not mask identity within genome-wide association studies. The implications of these findings are discussed.
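The abstract does not spell out its test statistic, so the following is only a sketch of one distance-based approach of this general kind: for each SNP, compare how far the individual's allele frequency lies from a reference population versus from the mixture, then summarise the per-SNP differences with a one-sample t statistic. The function name `mixture_membership_t` and the input layout are hypothetical.

```python
def mixture_membership_t(person, mixture, reference):
    """One-sample t statistic on per-SNP distance differences.

    person    : allele frequencies of the individual (0, 0.5 or 1 per SNP)
    mixture   : allele frequencies measured in the DNA mixture
    reference : allele frequencies in a reference population
    A positive value means the individual looks closer to the mixture
    than to the reference population (assumed interpretation).
    """
    d = [abs(y - r) - abs(y - m)
         for y, m, r in zip(person, mixture, reference)]
    n = len(d)
    mean = sum(d) / n
    var = sum((x - mean) ** 2 for x in d) / (n - 1)
    return mean / (var / n) ** 0.5
```

With many thousands of SNPs, even a small systematic shift per SNP accumulates into a large statistic, which is one way to see why a contributor at a very low fraction could still be detectable.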

8.
A total of 63 isolates were screened for the gyrA mutation (87Asp-Tyr) in Salmonella enterica serovars using real time PCR. All of the isolates were successfully identified as resistant or susceptible, consistent with the MIC result of the agar dilution method and gyrA sequencing.

9.

Background

At the current price, the use of high-density single nucleotide polymorphism (SNP) genotyping assays in genomic selection of dairy cattle is limited to applications involving elite sires and dams. The objective of this study was to evaluate the use of low-density assays to predict direct genomic value (DGV) on five milk production traits, an overall conformation trait, a survival index, and two profit index traits (APR, ASI).

Methods

Dense SNP genotypes were available for 42,576 SNP for 2,114 Holstein bulls and 510 cows. A subset of 1,847 bulls born between 1955 and 2004 was used as a training set to fit models with various sets of pre-selected SNP. A group of 297 bulls born between 2001 and 2004 and all cows born between 1992 and 2004 were used to evaluate the accuracy of DGV prediction. Ridge regression (RR) and partial least squares regression (PLSR) were used to derive prediction equations and to rank SNP based on the absolute value of the regression coefficients. Four alternative strategies were applied to select subsets of SNP, namely: subsets of the highest ranked SNP for each individual trait, or a single subset of evenly spaced SNP, where SNP were selected based on their rank for ASI, APR or minor allele frequency within intervals of approximately equal length.
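The evenly spaced selection strategy can be sketched as follows: split a chromosome into intervals of equal length and keep the best-scoring SNP in each. This is an illustration only; the function name `select_evenly_spaced`, the tuple layout, and the use of a single generic score (which could stand for the rank-based score for ASI or APR, or for minor allele frequency) are assumptions rather than details from the abstract.

```python
def select_evenly_spaced(snps, n_intervals):
    """Pick one SNP per equal-length interval.

    snps        : list of (name, position, score) on one chromosome
    n_intervals : number of equal-length intervals to cover
    Returns the names of the highest-scoring SNP in each non-empty
    interval, ordered by position along the chromosome.
    """
    positions = [pos for _, pos, _ in snps]
    start, end = min(positions), max(positions)
    width = (end - start) / n_intervals
    best = {}
    for name, pos, score in snps:
        if width > 0:
            # The last position falls on the upper edge, so clamp it.
            idx = min(int((pos - start) / width), n_intervals - 1)
        else:
            idx = 0
        if idx not in best or score > best[idx][2]:
            best[idx] = (name, pos, score)
    return [best[i][0] for i in sorted(best)]
```

Intervals without any SNP simply contribute nothing, so the returned subset can be smaller than `n_intervals` on sparsely covered regions.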

Results

RR and PLSR performed very similarly to predict DGV, with PLSR performing better for low-density assays and RR for higher-density SNP sets. When using all SNP, DGV predictions for production traits, which have a higher heritability, were more accurate (0.52-0.64) than for survival (0.19-0.20), which has a low heritability. The gain in accuracy using subsets that included the highest ranked SNP for each trait was marginal (5-6%) over a common set of evenly spaced SNP when at least 3,000 SNP were used. Subsets containing 3,000 SNP provided more than 90% of the accuracy that could be achieved with a high-density assay for cows, and 80% of the high-density assay for young bulls.

Conclusions

Accurate genomic evaluation of the broader bull and cow population can be achieved with a single genotyping assay containing ~ 3,000 to 5,000 evenly spaced SNP.

10.
A photocleavable o-nitrobenzyl CE phosphoramidite building-block was synthesised and incorporated within oligonucleotides. After allele-specific primer extension, desalting was performed using genostrep purification plates. Photocleavage released the part containing the SNP information, creating shortened molecules that are easily accessible for MALDI-TOF analysis. Additionally, incorporation of mass modified nucleosides enables flexible design of multiplex genotyping.

11.
Oilseed rape (Brassica napus) is an allotetraploid species consisting of two genomes, derived from B. rapa (A genome) and B. oleracea (C genome). The presence of these two genomes makes single nucleotide polymorphism (SNP) marker identification and SNP analysis more challenging than in diploid species, as for a given locus usually two versions of a DNA sequence (based on the two ancestral genomes) have to be analyzed simultaneously during SNP identification and analysis. One hundred amplicons derived from expressed sequence tags (ESTs) were analyzed to identify SNPs in a panel of oilseed rape varieties and within two sister species representing the ancestral genomes. A total of 604 SNPs were identified, averaging one SNP in every 42 bp. It was possible to clearly discriminate SNPs that are polymorphic between different plant varieties from SNPs differentiating the two ancestral genomes. To validate the identified SNPs for their use in genetic analysis, we have developed Illumina GoldenGate assays for some of the identified SNPs. Through the analysis of a number of oilseed rape varieties and mapping populations with GoldenGate assays, we were able to identify a number of different segregation patterns in allotetraploid oilseed rape. The majority of the identified SNP markers can be readily used for genetic mapping, showing that amplicon sequencing and Illumina GoldenGate assays can be used to reliably identify SNP markers in tetraploid oilseed rape and to convert them into successful SNP assays that can be used for genetic analysis.

12.
Genotyping by sequencing (GBS) is the latest application of next-generation sequencing protocols for the purposes of discovering and genotyping SNPs in a variety of crop species and populations. Unlike other high-density genotyping technologies which have mainly been applied to general interest “reference” genomes, the low cost of GBS makes it an attractive means of saturating mapping and breeding populations with a high density of SNP markers. One barrier to the widespread use of GBS has been the difficulty of the bioinformatics analysis as the approach is accompanied by a high number of erroneous SNP calls which are not easily diagnosed or corrected. In this study, we use a 384-plex GBS protocol to add 30,984 markers to an indica (IR64) × japonica (Azucena) mapping population consisting of 176 recombinant inbred lines of rice (Oryza sativa) and we release our imputation and error correction pipeline to address initial GBS data sparsity and error, and streamline the process of adding SNPs to RIL populations. Using the final imputed and corrected dataset of 30,984 markers, we were able to map recombination hot and cold spots and regions of segregation distortion across the genome with a high degree of accuracy, thus identifying regions of the genome containing putative sterility loci. We mapped QTL for leaf width and aluminum tolerance, and were able to identify additional QTL for both phenotypes when using the full set of 30,984 SNPs that were not identified using a subset of only 1,464 SNPs, including a previously unreported QTL for aluminum tolerance located directly within a recombination hotspot on chromosome 1. These results suggest that adding a high density of SNP markers to a mapping or breeding population through GBS has a great value for numerous applications in rice breeding and genetics research.

13.
Molecular Biology Reports - Single nucleotide polymorphisms (SNPs) are the main type of variation in genome, enabling them to be associated with traits of economic importance in livestock....

14.
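Principles and progress of single nucleotide polymorphism genotyping technology
Understanding the relationship between genetic variation and biological function at the genome scale is expected to bring entirely new and deeper insight to biology. This paper discusses the current state of SNP genotyping methods from three aspects, namely the allele discrimination mechanism, the reaction format, and the detection method, and briefly introduces some of the genotyping methods currently in use.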

15.
We describe a method for the efficient genotyping of SNPs, involving sequencing of ordered and catenated sequence-tagged sites (OCS). In OCS, short genomic segments, each containing an SNP, are amplified by PCR using primers that carry specially designed extra nucleotides at their 5′-ends. Amplification products are then combined and converted to a concatamer in a defined order by a second round of thermal cycling. The concatenation takes place because the 5′-ends of each amplicon are designed to be complementary to the ends of the presumptive neighboring amplicons. The primer sequences for OCS are chosen using newly developed dedicated software, OCS Optimizer. Using sets of SNPs, we show that at least 10 STSs can be concatenated in a predefined order and all SNPs in the STSs are accurately genotyped by one two-way sequencing reaction.

16.
An equivalent model for multibreed variance covariance estimation is presented. It considers the additive case, with or without the segregation variances. The model is based on splitting the additive genetic values into several independent parts depending on their genetic origin. For each part, it expresses the covariance between relatives as a partial numerator relationship matrix times the corresponding variance component. Estimation of fixed effects, random effects or variance components under the model is as simple as in any model that includes several random factors. We present a small example describing the mixed model equations for genetic evaluations and two simulated examples to illustrate the Bayesian variance component estimation.

17.
The synthesis of positively charged and mass tagged nucleosides containing a quaternary ammonium functionality within the penultimate position of a primer is described. Neutralization of the sugar/thiophosphate backbone by alkylation increases the detection sensitivity in the mass spectrometric analysis by a factor of at least 100. The variable introduction of these novel compounds within the extension primers enables flexible design of multiplex genotyping reactions.

18.
Holstein Friesian cow training sets were created according to disease incidences. The different datasets were used to investigate the impact of random forest (RF) and genomic BLUP (GBLUP) methodology on genomic prediction accuracies. In addition, for further verifications of some specific scenarios, single‐step genomic BLUP was applied. Disease traits included the overall trait categories of (i) claw disorders, (ii) clinical mastitis and (iii) infertility from 80 741 first lactation Holstein cows kept in 58 large‐scale herds. A subset of 6744 cows was genotyped (50K SNP panel). Response variables for all scenarios were de‐regressed proofs (DRPs) and pre‐corrected phenotypes (PCPs). Initially, all sick cows were allocated to the testing set, and healthy cows represented the training set. For the ongoing cow allocation schemes, the number of sick cows in the training set increased stepwise by moving 10% of the sick cows from the testing to the training set in each step. The size of training and testing sets was kept constant by replacing the same number of cows in the testing set with (randomly selected) healthy cows from the training set. For both the RF and GBLUP methods, prediction accuracies were larger for DRPs compared to PCPs. For PCPs as a response variable, the largest prediction accuracies were observed when the disease incidences in training sets reflected the disease incidence in the whole population. A further increase in prediction accuracies for some selected cow allocation schemes (i.e. larger prediction accuracies compared to corresponding scenarios with RF or GBLUP) was achieved via single‐step GBLUP applications. Correlations between genome‐wide association study SNP effects and RF importance criteria for single SNPs were in a moderate range, from 0.42 to 0.57, when considering SNPs from all chromosomes or from specific chromosome segments.
RF identified significant SNPs close to potential positional candidate genes: GAS1, GPAT3 and CYP2R1 for clinical mastitis; SPINK5 and SLC26A2 for laminitis; and FGF12 for endometritis.
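The stepwise cow allocation scheme described in this abstract can be sketched as a simple swap procedure. The sketch assumes that each step moves 10% of the total number of sick cows and that steps continue until no sick cows remain in the testing set; the abstract does not say how far the steps were taken, and the function name `allocation_schemes` is hypothetical.

```python
import random


def allocation_schemes(sick, healthy, step_fraction=0.10, seed=0):
    """Build successive (training, testing) splits of constant size.

    Scheme 0: all healthy cows train, all sick cows test.
    Each later scheme moves k sick cows from testing to training and
    k randomly chosen healthy cows from training to testing.
    """
    rng = random.Random(seed)
    k = round(step_fraction * len(sick))
    sick_in_test = list(sick)
    healthy_in_train = list(healthy)
    rng.shuffle(sick_in_test)      # random choice of which sick cows move
    rng.shuffle(healthy_in_train)  # random choice of replacement cows
    sick_in_train, healthy_in_test = [], []
    schemes = [(list(healthy_in_train), list(sick_in_test))]
    while k > 0 and len(sick_in_test) >= k and len(healthy_in_train) >= k:
        sick_in_train += [sick_in_test.pop() for _ in range(k)]
        healthy_in_test += [healthy_in_train.pop() for _ in range(k)]
        schemes.append((healthy_in_train + sick_in_train,
                        sick_in_test + healthy_in_test))
    return schemes
```

Because every step is a one-for-one swap, the disease incidence in the training set rises in fixed increments while both set sizes stay unchanged, which isolates the effect of incidence from the effect of training set size.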

19.

Background

Genomic selection (GS) uses molecular breeding values (MBV) derived from dense markers across the entire genome for selection of young animals. The accuracy of MBV prediction is important for a successful application of GS. Recently, several methods have been proposed to estimate MBV. Initial simulation studies have shown that these methods can accurately predict MBV. In this study we compared the accuracies and possible bias of five different regression methods in an empirical application in dairy cattle.

Methods

Genotypes of 7,372 SNP and highly accurate EBV of 1,945 dairy bulls were used to predict MBV for protein percentage (PPT) and a profit index (Australian Selection Index, ASI). Marker effects were estimated by least squares regression (FR-LS), Bayesian regression (Bayes-R), random regression best linear unbiased prediction (RR-BLUP), partial least squares regression (PLSR) and nonparametric support vector regression (SVR) in a training set of 1,239 bulls. Accuracy and bias of MBV prediction were calculated from cross-validation of the training set and tested against a test team of 706 young bulls.
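Accuracy and bias of this kind are often summarised by two numbers: the correlation between predicted MBV and EBV, and the slope of the regression of EBV on MBV, where a slope of 1 indicates unbiased scaling. The abstract does not state that bias was measured this way, so the slope convention and the function name `accuracy_and_slope` are assumptions in this minimal sketch.

```python
def accuracy_and_slope(mbv, ebv):
    """Return (correlation, regression slope of EBV on MBV).

    mbv : predicted molecular breeding values of validation animals
    ebv : their (accurate) estimated breeding values
    """
    n = len(mbv)
    mean_x = sum(mbv) / n
    mean_y = sum(ebv) / n
    sxx = sum((x - mean_x) ** 2 for x in mbv)
    syy = sum((y - mean_y) ** 2 for y in ebv)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(mbv, ebv))
    correlation = sxy / (sxx * syy) ** 0.5
    slope = sxy / sxx  # 1.0 would mean predictions are neither inflated nor deflated
    return correlation, slope
```

Applied once to the cross-validation folds and once to the young test bulls, the same two numbers make the reported drop in accuracy between the two settings directly comparable.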

Results

For both traits, FR-LS using a subset of SNP was significantly less accurate than all other methods which used all SNP. Accuracies obtained by Bayes-R, RR-BLUP, PLSR and SVR were very similar for ASI (0.39-0.45) and for PPT (0.55-0.61). Overall, SVR gave the highest accuracy. All methods resulted in biased MBV predictions for ASI; for PPT, only RR-BLUP and SVR predictions were unbiased. A significant decrease in accuracy of prediction of ASI was seen in young test cohorts of bulls compared to the accuracy derived from cross-validation of the training set. This reduction was not apparent for PPT. Combining MBV predictions with pedigree based predictions gave 1.05 - 1.34 times higher accuracies compared to predictions based on pedigree alone. Some methods have largely different computational requirements, with PLSR and RR-BLUP requiring the least computing time.

Conclusions

The four methods which use information from all SNP, namely RR-BLUP, Bayes-R, PLSR and SVR, generate similar accuracies of MBV prediction for genomic selection, and their use in the selection of immediate future generations in dairy cattle will be comparable. The use of FR-LS in genomic selection is not recommended.

20.

Background

A newly recognized type of genetic variation, Copy Number Variation (CNV), is detected in mammalian genomes, e.g. the cattle genome. This form of variation can potentially cause phenotypic variation. Our objective was to determine whether dense SNP (single nucleotide polymorphisms) panels can capture the genetic variation due to a simple bi-allelic CNV, with the prospect of including the effect of such structural variations into genomic predictions.

Methods

A deletion type CNV on bovine chromosome 6 was predicted from its neighboring SNP with a multiple regression model. Our dataset consisted of CNV genotypes of 1,682 cows, along with 100 surrounding SNP genotypes. A prediction model was fitted considering 10 to 100 surrounding SNP and the accuracy obtained directly from the model was confirmed by cross-validation.

Results and conclusions

The accuracy of prediction increased with an increasing number of SNP in the model and the predicted accuracies were similar to those obtained by cross-validation. A substantial increase in accuracy was observed when the number of SNP increased from 10 to 50 but thereafter the increase was smaller, reaching the highest accuracy (0.94) with 100 surrounding SNP. Thus, we conclude that the genotype of a deletion type CNV and its putative QTL effect can be predicted with a maximum accuracy of 0.94 from surrounding SNP. This high prediction accuracy suggests that genetic variation due to simple deletion CNV is well captured by dense SNP panels. Since genomic selection relies on the availability of a dense marker panel with markers in close linkage disequilibrium to the QTL in order to predict their genetic values, we also discuss opportunities for genomic selection to predict the effects of CNV by dense SNP panels, when CNV cause variation in quantitative traits.
