Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
The use of Gibbs sampling in making decisions about the optimal selection environment was demonstrated. Marginal posterior distributions of the efficiency of selection across sites were obtained using the Gibbs sampler, a Bayesian method. From these distributions, the probability that the efficiency of selection lay between specified values and the variance of the distribution were computed, providing information on which to base decisions about the location of genetic tests. The heritability, genetic correlations and efficiencies of selection estimated using REML and Gibbs sampling were similar. However, the latter approach showed that the point estimates of the efficiencies of selection were subject to substantial error. The decision regarding selection at maturity was consistent with that obtained using point estimates from REML, but Gibbs sampling allowed the efficiencies of selection to be interpreted with more confidence. The decision regarding early selection differed from that based on REML point estimates. Generally, the decisions to make early selections at site B for planting at both sites A and B, and to make selections at maturity at each individual site, were robust to different priors in the Gibbs sampling. Received: 19 June 2000 / Accepted: 18 October 2000
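The posterior summaries described in this abstract can be illustrated with a minimal sketch. It assumes the Gibbs sampler has already produced a list of draws of an efficiency of selection; the function name `summarize_draws` and the draws below are hypothetical and are not taken from the study.

```python
from statistics import mean, variance


def summarize_draws(draws, lower, upper):
    """Summarize posterior draws of a scalar parameter.

    Returns the posterior mean, the sample variance of the draws, and the
    Monte Carlo estimate of P(lower <= parameter <= upper).
    """
    inside = sum(1 for d in draws if lower <= d <= upper)
    return {
        "mean": mean(draws),
        "variance": variance(draws),
        "prob_in_range": inside / len(draws),
    }


# Hypothetical draws of an efficiency of selection (illustration only).
draws = [0.6, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 0.7]
summary = summarize_draws(draws, 0.8, 1.2)
```

In practice the list would hold thousands of post-burn-in draws, and the interval bounds would be chosen to match the decision being made (for example, whether indirect selection is at least as efficient as direct selection).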

2.
An increased availability of genotypes at marker loci has prompted the development of models that include the effect of individual genes. Selection based on these models is known as marker-assisted selection (MAS). MAS is known to be efficient especially for traits that have low heritability and non-additive gene action. BLUP methodology under non-additive gene action is not feasible for large inbred or crossbred pedigrees. It is easy to incorporate non-additive gene action in a finite locus model. Under such a model, the unobservable genotypic values can be predicted using the conditional mean of the genotypic values given the data. To compute this conditional mean, conditional genotype probabilities must be computed. In this study these probabilities were computed using iterative peeling and three Markov chain Monte Carlo (MCMC) methods: scalar Gibbs, blocking Gibbs, and a sampler that combines the Elston-Stewart algorithm with iterative peeling (ESIP). The performance of these four methods was assessed using simulated data. For pedigrees with loops, iterative peeling fails to provide accurate genotype probability estimates for some pedigree members. Also, computing time is exponentially related to the number of loci in the model. For MCMC methods, a linear relationship can be maintained by sampling genotypes one locus at a time. Of the three MCMC methods considered, ESIP performed best and scalar Gibbs performed worst.

3.

Key message

Compared with independent validation, cross-validation simultaneously sampling genotypes and environments provided similar estimates of accuracy for genomic selection, but inflated estimates for marker-assisted selection.

Abstract

Estimates of prediction accuracy of marker-assisted (MAS) and genomic selection (GS) require validations. The main goal of our study was to compare the prediction accuracies of MAS and GS validated in an independent sample with results obtained from fivefold cross-validation using genomic and phenotypic data for Fusarium head blight resistance in wheat. In addition, the applicability of the reliability criterion, a concept originally developed in the context of classic animal breeding and GS, was explored for MAS. We observed that prediction accuracies of MAS were overestimated by 127% using cross-validation sampling genotype and environments in contrast to independent validation. In contrast, prediction accuracies of GS determined in independent samples are similar to those estimated with cross-validation sampling genotype and environments. This can be explained by small population differentiation between the training and validation sets in our study. For European wheat breeding, which is so far characterized by a slow temporal dynamic in allele frequencies, this assumption seems to be realistic. Thus, GS models used to improve European wheat populations are expected to possess a long-lasting validity. Since quantitative trait loci information can be exploited more precisely if the predicted genotype is more related to the training population, the reliability criterion is also a valuable tool to judge the level of prediction accuracy of individual genotypes in MAS.

4.
Genomic selection (GS) is of interest in breeding because of its potential for predicting the genetic value of individuals and increasing genetic gains per unit of time. To date, very few studies have reported empirical results of GS potential in the context of large population sizes and long breeding cycles such as for boreal trees. In this study, we assessed the effectiveness of marker-aided selection in an undomesticated white spruce (Picea glauca (Moench) Voss) population of large effective size using a GS approach. A discovery population of 1694 trees representative of 214 open-pollinated families from 43 natural populations was phenotyped for 12 wood and growth traits and genotyped for 6385 single-nucleotide polymorphisms (SNPs) mined in 2660 gene sequences. GS models were built to predict estimated breeding values using all the available SNPs or SNP subsets of the largest absolute effects, and they were validated using various cross-validation schemes. The accuracy of genomic estimated breeding values (GEBVs) varied from 0.327 to 0.435 when the training and the validation data sets shared half-sibs, which was on average 90% of the accuracy achieved through traditionally estimated breeding values. The trend was also the same for validation across sites. As expected, the accuracy of GEBVs obtained after cross-validation with individuals of unknown relatedness was lower, at about half of the accuracy achieved when half-sibs were present. We showed that with the marker densities used in the current study, predictions with low to moderate accuracy could be obtained within a large undomesticated population of related individuals, potentially resulting in larger gains per unit of time with GS than with the traditional approach.

5.
The paper presents a method of multivariate data analysis based on a model that involves fixed effects, additive polygenic individual effects and the effects of a major gene. Model parameters are estimated by maximizing the likelihood function, with the maximum computed using a Gibbs sampling approach. In this approach, values of all unknown parameters are generated from their conditional posterior distributions. From the resulting samples, the marginal posterior densities and the estimates of fixed effects, gene frequency, genotypic values, and major gene, polygenic and error (co)variances are calculated. A numerical example, supplementing the theoretical considerations, uses data simulated according to the model considered.
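The idea of drawing each unknown from its conditional posterior distribution in turn can be illustrated with a toy Gibbs sampler. This is only a sketch for a much simpler model than the one in the abstract: independent normal observations with unknown mean and variance, under assumed priors (flat on the mean, proportional to 1/variance on the variance). The function name `gibbs_normal` and the data are hypothetical.

```python
import random
from statistics import mean


def gibbs_normal(y, n_iter=5000, burn_in=500, seed=1):
    """Toy Gibbs sampler for y_i ~ N(mu, sigma2).

    Assumed priors (illustration only): flat on mu, p(sigma2) proportional
    to 1/sigma2. The full conditional distributions are then
      mu | sigma2, y  ~ N(ybar, sigma2 / n)
      sigma2 | mu, y  ~ SS / chi2_n, where SS = sum((y_i - mu)^2).
    """
    rng = random.Random(seed)
    n = len(y)
    ybar = mean(y)
    mu, sigma2 = ybar, 1.0  # arbitrary starting values
    mu_draws, sigma2_draws = [], []
    for it in range(n_iter):
        # Draw the mean given the current variance.
        mu = rng.gauss(ybar, (sigma2 / n) ** 0.5)
        # Draw the variance given the current mean.
        ss = sum((v - mu) ** 2 for v in y)
        chi2 = rng.gammavariate(n / 2.0, 2.0)  # chi-square with n d.f.
        sigma2 = ss / chi2
        if it >= burn_in:
            mu_draws.append(mu)
            sigma2_draws.append(sigma2)
    return mu_draws, sigma2_draws


# Hypothetical data (illustration only).
y = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6, 5.4, 5.2]
mu_draws, sigma2_draws = gibbs_normal(y)
posterior_mean_mu = mean(mu_draws)
```

The retained draws approximate the marginal posterior of each parameter, so point estimates and densities can be computed from them directly. A model with major gene and polygenic effects would add further conditional draws (genotypes, gene frequency, variance components) to the same loop.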

6.
The presence of major genes affecting rust resistance of loblolly pine was investigated in a progeny population that was generated with a half-diallel mating of six parents. A Bayesian complex segregation analysis was used to make inference about a mixed inheritance model (MIM) that included polygenic effects and a single major gene effect. Marginalization was achieved using the Gibbs sampler. Parent block sampling, in which the genotypes of a parent and its offspring are sampled jointly, was implemented to improve mixing. The MIM was compared with a pure polygenic model (PM) using the Bayes factor. Results showed that the MIM was a better model to explain the inheritance of rust resistance than the pure PM in the diallel population. A large major gene variance component estimate (> 50% of total variance) indicated the existence of major genes for rust resistance in the studied loblolly pine population. Based on estimations of parental genotypes, it appears that there may be two or more major genes affecting disease phenotypes in this diallel population.

7.
Zeng W  Ghosh S  Li B 《Genetical research》2004,83(2):143-154
Diallel mating is a frequently used design for estimating the additive and dominance genetic (polygenic) effects involved in quantitative traits observed in the half- and full-sib progenies generated in plant breeding programmes. Gibbs sampling has been used for making statistical inferences for a mixed-inheritance model (MIM) that includes both major genes and polygenes. However, using this approach it has not been possible to incorporate the genetic properties of major genes with the additive and dominance polygenic effects in a diallel mating population. A parent block Gibbs sampling method was developed in this study to make statistical inferences about the major gene and polygenic effects on quantitative traits for progenies derived from a half-diallel mating design. Using simulated data sets with different major and polygenic effects, the proposed method accurately estimated the major and polygenic effects of quantitative traits, and possible genotypes of parents and progenies. The impact of specifying different prior distributions was examined and was found to have little effect on inference on the posterior distribution. This approach was applied to an experimental data set of loblolly pine (Pinus taeda L.) derived from a 6-parent half-diallel mating. The result indicated that there might be a recessive major gene affecting height growth in this diallel population.

8.
Keightley PD  Halligan DL 《Genetics》2011,188(4):931-940
Sequencing errors and random sampling of nucleotide types among sequencing reads at heterozygous sites present challenges for accurate, unbiased inference of single-nucleotide polymorphism genotypes from high-throughput sequence data. Here, we develop a maximum-likelihood approach to estimate the frequency distribution of the number of alleles in a sample of individuals (the site frequency spectrum), using high-throughput sequence data. Our method assumes binomial sampling of nucleotide types in heterozygotes and random sequencing error. By simulations, we show that close to unbiased estimates of the site frequency spectrum can be obtained if the error rate per base read does not exceed the population nucleotide diversity. We also show that these estimates are reasonably robust if errors are nonrandom. We then apply the method to infer site frequency spectra for zerofold degenerate, fourfold degenerate, and intronic sites of protein-coding genes using the low coverage human sequence data produced by the 1000 Genomes Project phase-one pilot. By fitting a model to the inferred site frequency spectra that estimates parameters of the distribution of fitness effects of new mutations, we find evidence for significant natural selection operating on fourfold sites. We also find that a model with variable effects of mutations at synonymous sites fits the data significantly better than a model with equal mutational effects. Under the variable effects model, we infer that 11% of synonymous mutations are subject to strong purifying selection.
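The two assumptions named in this abstract, binomial sampling of reads in heterozygotes and random sequencing error, can be made concrete with a small sketch of per-site genotype likelihoods. It is a simplified illustration, not the authors' estimator of the site frequency spectrum: it assumes a biallelic site, a symmetric per-read error rate, and (for the posterior) Hardy-Weinberg genotype priors. The function names and the numbers are hypothetical.

```python
from math import comb


def read_likelihoods(n_alt, n_reads, error=0.01):
    """P(n_alt alternate reads out of n_reads | genotype) for genotypes 0, 1, 2.

    Genotype is the number of alternate alleles. Assumes a symmetric error
    rate, so a heterozygote yields an alternate read with probability 0.5.
    """
    prob_alt_read = {0: error, 1: 0.5, 2: 1.0 - error}
    return {
        g: comb(n_reads, n_alt) * p ** n_alt * (1.0 - p) ** (n_reads - n_alt)
        for g, p in prob_alt_read.items()
    }


def genotype_posterior(n_alt, n_reads, alt_freq, error=0.01):
    """Posterior genotype probabilities under assumed Hardy-Weinberg priors."""
    prior = {
        0: (1.0 - alt_freq) ** 2,
        1: 2.0 * alt_freq * (1.0 - alt_freq),
        2: alt_freq ** 2,
    }
    lik = read_likelihoods(n_alt, n_reads, error)
    joint = {g: prior[g] * lik[g] for g in prior}
    total = sum(joint.values())
    return {g: v / total for g, v in joint.items()}


# Hypothetical low-coverage site: 1 alternate read out of 4.
lik = read_likelihoods(1, 4)
post = genotype_posterior(1, 4, alt_freq=0.1)
```

With low coverage, a single alternate read is compatible with both a heterozygote and a sequencing error in a homozygote, which is why the error rate relative to the true level of polymorphism matters for unbiased inference.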

9.
The application of Gibbs sampling is considered for inference in a mixed inheritance model in animal populations. Implementation of the Gibbs sampler on scalar components, as used for human populations, appeared not to be efficient, and an approach with blockwise sampling of genotypes was proposed for use in animal populations. The blockwise sampling, by which genotypes of a sire and its final progeny were sampled jointly, was effective in improving mixing, although further improvements could be looked for. Posterior densities of parameters were visualised from Gibbs samples; from these densities, highly marginalised Bayesian point and interval estimates can be obtained.

10.
In a previous contribution, we implemented a finite locus model (FLM) for estimating additive and dominance genetic variances via a Bayesian method and a single-site Gibbs sampler. We observed a dependency of dominance variance estimates on locus number in the analysis FLM. Here, we extended the FLM to include two-locus epistasis, and implemented the analysis with two genotype samplers (Gibbs and descent graph) and three different priors for genetic effects (uniform and variable across loci, uniform and constant across loci, and normal). Phenotypic data were simulated for two pedigrees with 6300 and 12,300 individuals in closed populations, using several different, non-additive genetic models. Replications of these data were analysed with FLMs differing in the number of loci. Simulation results indicate that the dependency of non-additive genetic variance estimates on locus number persisted in all implementation strategies we investigated. However, this dependency was considerably diminished with normal priors for genetic effects as compared with uniform priors (constant or variable across loci). Descent graph sampling of genotypes modestly improved variance components estimation compared with Gibbs sampling. Moreover, a larger pedigree produced considerably better variance components estimation, suggesting this dependency might originate from data insufficiency. As the FLM represents an appealing alternative to the infinitesimal model for genetic parameter estimation and for inclusion of polygenic background variation in QTL mapping analyses, further improvements are warranted and might be achieved via improvement of the sampler or treatment of the number of loci as an unknown.

11.
Dupuis JA  Schwarz CJ 《Biometrics》2007,63(4):1015-1022
This article considers a Bayesian approach to the multistate extension of the Jolly-Seber model commonly used to estimate population abundance in capture-recapture studies. It extends the work of George and Robert (1992, Biometrika 79, 677-683), which dealt with the Bayesian estimation of a closed population with only a single state for all animals. A super-population is introduced to model new entrants in the population. Bayesian estimates of abundance are obtained by implementing a Gibbs sampling algorithm based on data augmentation of the missing data in the capture histories when the state of the animal is unknown. Moreover, a partitioning of the missing data is adopted to ensure the convergence of the Gibbs sampling algorithm even in the presence of impossible transitions between some states. Lastly, we apply our methodology to a population of fish to estimate abundance and movement.

12.
Estimation of Allele Frequencies at Isoloci   Total citations: 3 (self-citations: 0, citations by others: 3)
R. S. Waples 《Genetics》1988,118(2):371-384
In some polyploid animals and plants, pairs of duplicated loci occur that share alleles encoding proteins with identical electrophoretic mobilities. Except in cases where these "isoloci" are known to be inherited tetrasomically, individual genotypes cannot be determined unambiguously, and there is no direct way to assign observed variation to a particular locus of the pair. For a pair of diallelic isoloci, nine genotypes are possible but only five phenotypes can be identified, corresponding to individuals with 0-4 doses of the variant allele. A maximum likelihood (ML) approach is used here to identify the set of allele frequencies (p, q) at the individual gene loci with the highest probability of producing the observed phenotypic distribution. A likelihood ratio test is used to generate the asymmetrical confidence intervals around ML estimates. Simulations indicate that the standard error of p is typically about twice the binomial sampling error associated with single locus allele frequency estimates. ML estimates can be used in standard indices of genetic diversity and differentiation and in goodness-of-fit tests of genetic hypotheses. The noncentral χ² distribution is used to evaluate the power of a test of apparent heterozygote deficiency that results from attributing all variation to one locus when both loci are polymorphic.
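The mapping from nine genotypes to five observable dose classes, and the search for the most likely (p, q), can be sketched as follows. The sketch assumes disomic inheritance, Hardy-Weinberg proportions at each locus and independence between the two loci. It uses a plain grid search rather than whatever optimizer the paper used. The function names and the counts are hypothetical.

```python
from math import comb, log


def dose_probs(p, q):
    """Probabilities of 0..4 doses of the variant allele across two loci.

    Each locus contributes 0, 1 or 2 doses (binomial with frequency p or q),
    so the nine two-locus genotypes collapse into five dose classes.
    """
    probs = [0.0] * 5
    for a in range(3):
        for b in range(3):
            pa = comb(2, a) * p ** a * (1 - p) ** (2 - a)
            pb = comb(2, b) * q ** b * (1 - q) ** (2 - b)
            probs[a + b] += pa * pb
    return probs


def log_likelihood(counts, p, q):
    """Multinomial log-likelihood (up to a constant) of dose-class counts."""
    total = 0.0
    for c, pr in zip(counts, dose_probs(p, q)):
        if c > 0:
            if pr <= 0.0:
                return float("-inf")
            total += c * log(pr)
    return total


def ml_estimate(counts, steps=100):
    """Grid search for (p, q) with p >= q; the two loci are interchangeable."""
    best = (float("-inf"), 0.0, 0.0)
    for i in range(steps + 1):
        for j in range(i + 1):
            p, q = i / steps, j / steps
            ll = log_likelihood(counts, p, q)
            if ll > best[0]:
                best = (ll, p, q)
    return best[1], best[2]


# Hypothetical counts of individuals with 0, 1, 2, 3 and 4 doses.
counts = [3969, 4284, 1534, 204, 9]
p_hat, q_hat = ml_estimate(counts)
```

Because the loci cannot be told apart, only the unordered pair of frequencies is identifiable, which is why the search is restricted to p >= q.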

13.
We used empirical data from a laboratory population of Heterandria formosa (Pisces, Poeciliidae) and simulations based on those data to examine methods of estimating the intensity of phenotypic selection through fertility differences when generations overlap. The correct intensity was known because we knew the history of every female, as well as the rate at which the population was growing. The net reproductive rate, which is an appropriate measure of fitness when generations do not overlap, underestimated the true intensity of selection. This measure of fitness did not provide a rank order of individuals that agreed significantly with the correct rank order. Simulated schemes of vertical sampling of females, using number of offspring produced during the sampling window as a measure of fitness, provided no consistently useful information about the direction or intensity of selection. Weighting the number of offspring by female age, to give more value to offspring of younger females, produced only a slight improvement in accuracy. Other data indicated that fertility differentials are closely tied to particular environmental conditions. When generations overlap, estimates of fertility differentials will have to be based on horizontal studies of lifetime performance of a cohort coupled with demographic information on the entire population.

14.
Genetic relatedness is a vital parameter in the evolution of social behaviour by kin selection. It can be easily estimated using genetic markers and calculating the genotypic correlation or regression of group members. Spatial gene frequency differentiation, due to population subdivision or isolation by distance, boosts the relatedness estimates. In such cases it may be useful to partition the estimate into components; the operational relatedness is normally that among individuals in social groups within the same subpopulation. Although it is straightforward to estimate the average relatedness in social groups, estimating values for specific individuals with the help of genetic markers is still problematic. Current estimators tend to give biased values and the sampling error is large. In spite of these shortcomings, studies of social behaviour combining relatedness and reproductive success are sorely needed.

15.
Mäki K  Janss LL  Groen AF  Liinamo AE  Ojala M 《Heredity》2004,92(5):402-408
The aim of the study was to assess the possible existence of major genes influencing hip and elbow dysplasia in four dog populations. A Bayesian segregation analysis was performed separately on each population. In total, 34 140 dogs were included in the data set. Data were analysed with both a polygenic and a mixed inheritance model. Polygenic models included fixed and random environmental effects and additive genetic effects. To apply mixed inheritance models, the effect of a major gene was added to the polygenic models. The major gene was modelled as an autosomal biallelic locus with Mendelian transmission probabilities. Gibbs sampling and a Monte Carlo Markov Chain algorithm were used. The goodness-of-fit of the different models was compared using the residual sum-of-squares. The existence of a major gene was considered likely for hip dysplasia in all the breeds and for elbow dysplasia in one breed. Several procedures were followed to exclude the possible false detection of major genes based on non-normality of data: permuted datasets were analysed, data-transformations were applied, and residuals were judged for normality. Allelic effects at the major gene locus showed near-complete to complete dominance, with a recessive, unfavourable allele in both traits. Relatively high estimates of the frequencies of unfavourable alleles in each breed suggest that considerable genetic progress would be possible by selection against major genes. However, the major genes that are possibly affecting hip and elbow dysplasia in these populations will require further study.

16.
Although Chinese hamster ovary (CHO) cells, with their unique characteristics, have become a major workhorse for the manufacture of therapeutic recombinant proteins, one of the major challenges in CHO cell line generation (CLG) is how to efficiently identify those rare, high‐producing clones among a large population of low‐ and non‐productive clones. It is not unusual that several hundred individual clones need to be screened for the identification of a commercial clonal cell line with acceptable productivity and growth profile making the cell line appropriate for commercial application. This inefficiency makes the process of CLG both time consuming and laborious. Currently, there are two main CHO expression systems, dihydrofolate reductase (DHFR)‐based methotrexate (MTX) selection and glutamine synthetase (GS)‐based methionine sulfoximine (MSX) selection, that have been in wide industrial use. Since selection of recombinant cell lines in the GS‐CHO system is based on the balance between the expression of the GS gene introduced by the expression plasmid and the addition of the GS inhibitor, L‐MSX, the expression of GS from the endogenous GS gene in parental CHOK1SV cells will likely interfere with the selection process. To study endogenous GS expression's potential impact on selection efficiency, GS‐knockout CHOK1SV cell lines were generated using the zinc finger nuclease (ZFN) technology designed to specifically target the endogenous CHO GS gene. The high efficiency (~2%) of bi‐allelic modification on the CHO GS gene supports the unique advantages of the ZFN technology, especially in CHO cells. GS enzyme function disruption was confirmed by the observation of glutamine‐dependent growth of all GS‐knockout cell lines. Full evaluation of the GS‐knockout cell lines in a standard industrial cell culture process was performed. Bulk culture productivity improved two‐ to three‐fold through the use of GS‐knockout cells as parent cells. 
The selection stringency was significantly increased, as indicated by the large reduction of non‐producing and low‐producing cells after 25 µM L‐MSX selection, and resulted in a six‐fold efficiency improvement in identifying similar numbers of high‐productive cell lines for a given recombinant monoclonal antibody. The potential impact of GS‐knockout cells on recombinant protein quality is also discussed. Biotechnol. Bioeng. 2012; 109:1007–1015. © 2011 Wiley Periodicals, Inc.

17.
Morris and Spieth (1978) described a method of calculating unbiased estimates of diploid genotype frequencies given information on the genotypes of haploid cells derived from diploid individuals. They concluded that three haploids per diploid would minimize sampling variance of genotype frequencies, given a fixed total number of haploids examined. If the identity of individual diploid genotypes is needed, Morris and Spieth (1978) stated that more haploids should be collected per diploid. We extend this work by showing from a Bayesian perspective that the probability of misclassification of individuals depends not only on the number of haploids sampled, but also on the genetic structure of the population since misclassification error will increase as the frequency of heterozygotes increases. Since information on the genetic structure (allele frequencies, inbreeding coefficient) of a population is rarely known prior to the initiation of an empirical study, the usefulness of our Bayesian approach is in experimental design, by revealing the magnitude of possible misclassification errors given a particular choice of number of haploids.
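The dependence of misclassification on both the number of haploids and the population's genetic structure can be illustrated with a short Bayes-rule sketch. It assumes a biallelic locus, equal segregation of alleles into haploids, no genotyping error, and genotype priors set by an allele frequency p and an inbreeding coefficient f. The function name and parameter values are hypothetical and are not taken from the paper.

```python
def prob_hidden_heterozygote(p, f, n_haploids):
    """Posterior probability that a diploid is a heterozygote even though
    all n sampled haploids carried the same allele A.

    Requires 0 < p <= 1 and 0 <= f <= 1. Priors: P(AA) = p^2 + f*p*q and
    P(Aa) = 2*p*q*(1 - f), with q = 1 - p.
    """
    q = 1.0 - p
    prior_hom = p * p + f * p * q
    prior_het = 2.0 * p * q * (1.0 - f)
    # A heterozygote yields n identical A haploids with probability (1/2)^n;
    # an AA homozygote always does.
    joint_het = prior_het * 0.5 ** n_haploids
    return joint_het / (prior_hom + joint_het)


# Hypothetical settings: equal allele frequencies, with and without inbreeding.
risk_3 = prob_hidden_heterozygote(0.5, 0.0, 3)
risk_6 = prob_hidden_heterozygote(0.5, 0.0, 6)
risk_3_inbred = prob_hidden_heterozygote(0.5, 0.5, 3)
```

Under these assumptions the error falls as more haploids are sampled and is smaller when inbreeding lowers the frequency of heterozygotes, which matches the direction of the effect the abstract describes.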

18.
Quality control filtering of single-nucleotide polymorphisms (SNPs) is a key step when analyzing genomic data. Here we present a practical method to identify low-quality SNPs, meaning markers whose genotypes are wrongly assigned for a large proportion of individuals, by estimating the heritability of gene content at each marker, where gene content is the number of copies of a particular reference allele in a genotype of an animal (0, 1, or 2). If there is no mutation at the marker, gene content has an additive heritability of 1 by construction. The method uses restricted maximum likelihood (REML) to estimate heritability of gene content at each SNP and also builds a likelihood-ratio test statistic to test for zero error variance in genotyping. As a by-product, estimates of the allele frequencies of markers at the base population are obtained. Using simulated data with 10% permutation error (4% actual error) in genotyping, the method had a specificity of 0.96 (4% of correct markers are rejected) and a sensitivity of 0.99 (1% of wrong markers are accepted) if markers with heritability lower than 0.975 are discarded. Checking of Mendelian errors resulted in a lower sensitivity (0.84) for the same simulation. The proposed method is further illustrated with a real data set with genotypes from 3534 animals genotyped for 50,433 markers from the Illumina PorcineSNP60 chip and a pedigree of 6473 individuals; those markers underwent very little quality control. A total of 4099 markers with P-values lower than 0.01 were discarded based on our method, with associated estimates of heritability as low as 0.12. Contrary to other techniques, our method uses all information in the population simultaneously, can be used in any population with markers and pedigree recordings, and is simple to implement using standard software for REML estimation. Scripts for its use are provided.

19.
A Method of Screening for Genes of Major Effect   Total citations: 1 (self-citations: 1, citations by others: 0)
B. P. Kinghorn  B. W. Kennedy    C. Smith 《Genetics》1993,134(1):351-360
This paper describes a method for screening animal populations on an index of calculated probabilities of genotype status at an unknown single locus. Animals selected by such a method might then be candidates in test matings and genetic marker analyses for major gene detection. The method relies on phenotypic measures for a continuous trait plus identification of sire and dam. Some missing phenotypes and missing pedigree information are permitted. The method is an iterative two-step procedure: the first step estimates genotype probabilities and the second step estimates genotypic effects by regressing phenotypes on genotype probabilities, modeled as true genotype status plus error. Prior knowledge or choice of major locus-free heritability for the trait of interest is required, plus initial starting estimates of the effect on phenotype of carrying one and two copies of the unknown gene. Gene frequency can be estimated by this method, but it is demonstrated that the consequences of using an incorrect fixed prior for gene frequency are not particularly adverse where true frequency of the allele with major effect is low. Simulations involving deterministic sampling from the normal distribution lead to convergence for estimates of genotype effects at the true values, for a reasonable range of starting values, illustrating that estimation of major gene effects has a rational basis. In the absence of polygenic effects, stochastic simulations of 600 animals in five generations resulted in estimates of genotypic effects close to the true values. However, stochastic simulations involving generation and fitting of both major genotype and animal polygenic effects showed upward bias in estimates of major genotype effects. This can be partially overcome by not using information from relatives when calculating genotype probabilities, a result which suggests a route to a modified method which is unbiased and yet does use this information.

20.
The analysis of population survey data on DNA sequence variation   Total citations: 27 (self-citations: 2, citations by others: 25)
A technique is presented for the partitioning of nucleotide diversity into within- and between-population components for the case in which multiple populations have been surveyed for restriction-site variation. This allows the estimation of an analogue of FST at the DNA level. Approximate expressions are given for the variance of these estimates resulting from nucleotide, individual, and population sampling. Application of the technique to existing studies on mitochondrial DNA in several animal species and on several nuclear genes in Drosophila indicates that the standard errors of genetic diversity estimates are usually quite large. Thus, comparative studies of nucleotide diversity need to be substantially larger than the current standards. Normally, only a very small fraction of the sampling variance is caused by sampling of individuals. Even when 20 or so restriction enzymes are employed, nucleotide sampling is a major source of error, and population sampling is often quite important. Generally, the degree of population subdivision at the nucleotide level is comparable with that at the haplotype level, but significant differences do arise as a result of inequalities in the genetic distances between haplotypes.
