Similar literature
 20 similar documents found (search time: 31 ms)
1.
For clustered multinomial responses fit with the generalized logistic link, Morel (1989) introduced a small-sample correction to the Taylor-series-based estimator of the covariance matrix of the parameter estimates. The correction reduces the bias of the Type I error rates in small samples and guarantees positive definiteness of the estimated variance‐covariance matrix. It is well known that small-sample bias in the use of the Delta method persists in any application of the Generalized Estimating Equations (GEE) methodology. In this article, we extend the correction originally suggested for the generalized logistic link to other link functions and distributions when parameters are estimated by GEE. In a Monte Carlo study with correlated data generated under different sampling schemes, the small-sample correction was shown to be effective in reducing the Type I error rates when the number of clusters is relatively small.

2.
Genomic selection has become increasingly important in the breeding of animals and plants. The response variable is an important factor influencing the accuracy of genomic selection. The de-regressed proof (DRP) based on the traditional estimated breeding value (EBV) is commonly used as the response variable. In the current study, simulated data from the 16th QTL-MAS Workshop and real data from Chinese Holstein cattle were used to compare the accuracy and bias of genomic prediction with two methods of calculating DRP. Our results with simulated data showed that the correlation between genomic EBV and true breeding value achieved using the Jairath method (DRP_J) was superior to that achieved using the Garrick method (DRP_G) for simulated trait 1, but the reverse was true for simulated trait 3, and the two methods performed comparably for simulated trait 2. For all three simulated traits, DRP_J yielded larger bias of genomic prediction. However, DRP_J outperformed DRP_G in both accuracy and unbiasedness for four milk production traits in Chinese Holstein. In the estimation of genomic breeding values using the genomic BLUP model, two methods for weighting the diagonal elements of the incidence matrix associated with residual error were also compared. As the proportion of genetic variance unexplained by markers increased, the accuracy of genomic prediction decreased and the bias increased. Weighting by the reliability of DRP produced accuracy comparable to the evaluation where the proportion of genetic variance unexplained by markers was considered, but with smaller bias in general.

3.
A Markov chain Monte Carlo (MCMC) algorithm to sample an exchangeable covariance matrix, such as that of the error terms (R0) in a multiple trait animal model with missing records under normal-inverted Wishart priors, is presented. The algorithm (FCG) is based on a conjugate form of the inverted Wishart density that avoids sampling the missing error terms. Normal prior densities are assumed for the "fixed" effects and breeding values, whereas the covariance matrices are assumed to follow inverted Wishart distributions. The inverted Wishart prior for the environmental covariance matrix is a product density of all patterns of missing data. The resulting MCMC scheme eliminates the correlation between the sampled missing residuals and the sampled R0, which in turn decreases the total number of samples needed to reach convergence. The use of the FCG algorithm in a multiple trait data set with an extreme pattern of missing records produced a dramatic reduction in the size of the autocorrelations among samples for all lags from 1 to 50. Compared with the "data augmentation" algorithm, this increased the effective sample size by a factor of 2.5 to 7 and reduced the number of samples needed to attain convergence.

4.
We explore the estimation of uncertainty in evolutionary parameters using a recently devised approach for resampling entire additive genetic variance–covariance matrices (G). Large‐sample theory shows that maximum‐likelihood estimates (including restricted maximum likelihood, REML) asymptotically have a multivariate normal distribution, with covariance matrix derived from the inverse of the information matrix, and mean equal to the estimated G. This suggests that sampling estimates of G from this distribution can be used to assess the variability of estimates of G, and of functions of G. We refer to this as the REML‐MVN method. This has been implemented in the mixed‐model program WOMBAT. Estimates of sampling variances from REML‐MVN were compared to those from the parametric bootstrap and from a Bayesian Markov chain Monte Carlo (MCMC) approach (implemented in the R package MCMCglmm). We apply each approach to evolvability statistics previously estimated for a large, 20‐dimensional data set for Drosophila wings. REML‐MVN and MCMC sampling variances are close to those estimated with the parametric bootstrap. Both slightly underestimate the error in the best‐estimated aspects of the G matrix. REML analysis supports the previous conclusion that the G matrix for this population is full rank. REML‐MVN is computationally very efficient, making it an attractive alternative to both data resampling and MCMC approaches to assessing confidence in parameters of evolutionary interest.
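
The sampling idea described in this abstract can be sketched in a few lines. The following is a minimal illustration, not the WOMBAT implementation: the estimate vector `theta_hat` (the distinct elements of a 2 × 2 G matrix), the matrix `inv_info` (standing in for the inverse information matrix), and the choice of the genetic correlation as the function of G are all hypothetical values chosen for the example.

```python
import math
import random
import statistics


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def sample_mvn(mean, cov, n_draws, rng):
    """Draw n_draws vectors from N(mean, cov) using the Cholesky factor of cov."""
    low = cholesky(cov)
    dim = len(mean)
    draws = []
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        draws.append(
            [mean[i] + sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(dim)]
        )
    return draws


def genetic_correlation(theta):
    """Example function of G: correlation from (g11, g12, g22)."""
    g11, g12, g22 = theta
    return g12 / math.sqrt(g11 * g22)


# Hypothetical REML estimates (g11, g12, g22) and their sampling covariance.
theta_hat = [1.0, 0.3, 0.8]
inv_info = [
    [0.010, 0.002, 0.001],
    [0.002, 0.006, 0.002],
    [0.001, 0.002, 0.008],
]

rng = random.Random(1)
draws = sample_mvn(theta_hat, inv_info, 5000, rng)
r_draws = [genetic_correlation(d) for d in draws]
r_se = statistics.stdev(r_draws)  # approximate sampling SE of the correlation
```

The standard deviation of `r_draws` serves as an approximate standard error for the derived statistic; the same draws can be reused for any other function of G.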

5.
A genetic model for modified diallel crosses is proposed for estimating variance and covariance components of cytoplasmic, maternal additive and dominance effects, as well as direct additive and dominance effects. Monte Carlo simulations were conducted to compare the efficiencies of minimum norm quadratic unbiased estimation (MINQUE) methods. For both balanced and unbalanced mating designs, MINQUE(0/1), which has 0 for all the prior covariances and 1 for all the prior variances, has similar efficiency to MINQUE(), which has parameter values for the prior values. Unbiased estimates of variance and covariance components and their sampling variances could be obtained with MINQUE(0/1) and jackknifing. A t-test following jackknifing is applicable to test hypotheses for zero variance and covariance components. The genetic model is robust for estimating variance and covariance components under several situations of no specific effects. A MINQUE(0/1) procedure is suggested for unbiased estimation of covariance components between two traits with equal design matrices. Methods of unbiased prediction for random genetic effects are discussed. A linear unbiased prediction (LUP) method is shown to be efficient for the genetic model. An example is given to demonstrate the estimation of variance and covariance components and the prediction of genetic effects.

6.
Diallel analysis for sex-linked and maternal effects   Total citations: 40, self-citations: 0, citations by others: 40
Genetic models including sex-linked and maternal effects as well as autosomal gene effects are described. Monte Carlo simulations were conducted to compare efficiencies of estimation by minimum norm quadratic unbiased estimation (MINQUE) and restricted maximum likelihood (REML) methods. MINQUE(1), which has 1 for all prior values, has a similar efficiency to MINQUE(), which requires prior estimates of parameter values. MINQUE(1) has the advantage over REML of unbiased estimation and convenient computation. An adjusted unbiased prediction (AUP) method is developed for predicting random genetic effects. AUP is desirable for its easy computation and unbiasedness of both mean and variance of predictors. The jackknife procedure is appropriate for estimating the sampling variances of estimated variances (or covariances) and of predicted genetic effects. A t-test based on jackknife variances is applicable for detecting significance of variation. Worked examples from mice and silkworm data are given in order to demonstrate variance and covariance estimation and genetic effect prediction.
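
The jackknife step mentioned here can be illustrated generically. The sketch below is a plain delete-one jackknife on a list of observations; it is an assumption for illustration only, since the paper applies the jackknife to diallel data with MINQUE estimators. The data values and the use of the sample variance as the estimator are hypothetical.

```python
import math
import statistics


def jackknife(values, estimator):
    """Delete-one jackknife: returns (bias-corrected estimate, standard error)."""
    n = len(values)
    full = estimator(values)
    leave_one_out = [estimator(values[:i] + values[i + 1:]) for i in range(n)]
    # Pseudo-values combine the full-sample and leave-one-out estimates.
    pseudo = [n * full - (n - 1) * loo for loo in leave_one_out]
    estimate = statistics.fmean(pseudo)
    std_error = math.sqrt(statistics.variance(pseudo) / n)
    return estimate, std_error


# Hypothetical observations; the estimator is the sample variance.
data = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4]
var_hat, var_se = jackknife(data, statistics.variance)

# Test statistic for H0: variance component = 0, referred to a t
# distribution with n - 1 degrees of freedom.
t_stat = var_hat / var_se
```

For the sample mean the pseudo-values reduce to the observations themselves, so the jackknife standard error equals the usual standard error of the mean, which is a convenient sanity check.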

7.
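Based on the triploid genetic model for endosperm quality traits with genotype × environment interaction proposed by Zhu Jun (1996), Monte Carlo simulation was used to show that quantitative genetic analysis of experimental data from unbalanced designs is feasible with the MINQUE method for mixed linear models. The Monte Carlo results showed that, when the sample population sizes were roughly the same, the bias and power of the genetic parameters estimated under balanced and unbalanced experimental designs did not differ appreciably. This indicates that unbalanced data obtained from unbalanced designs can also be used for genetic analysis to estimate the genetic variance and covariance components of the genetic model above, and that the AUP method proposed by Zhu Jun (1993) can be used to predict the genetic effects in the model.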

8.
Prediction in mixed linear models by Henderson's (1972) BLUP (Best Linear Unbiased Prediction) requires knowledge of the underlying variance/covariance components to have the property 'best'. In breeding value prediction these parameters are generally not known. They have to be replaced by estimates, and BLUP becomes estimated BLUP (EBLUP). The aim of this investigation was the evaluation of EBLUP with the help of a designed simulation experiment. Criteria used for the evaluation were the mean squared error (MSE) and the (genetic) selection differential (GSD). In addition, an idea is given of how much the accuracy of EBLUP is overestimated by the naive MSE approximation, which is based on the MSE formulas of BLUP with variance component estimates in place of the unknown parameters.
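
The difference between BLUP and EBLUP can be made concrete with the simplest mixed model, a balanced one-way random-effects layout y_ij = mu + u_i + e_ij. This is a sketch under that assumption, not the simulation design of the study; the data and the use of ANOVA (method-of-moments) variance estimates are hypothetical choices for illustration.

```python
import statistics


def one_way_blup(groups, var_u, var_e):
    """BLUP of group effects u_i for given variance components."""
    means = [statistics.fmean(g) for g in groups]
    # A group mean has variance var_u + var_e / n_i; GLS weights are its inverse.
    weights = [1.0 / (var_u + var_e / len(g)) for g in groups]
    mu_hat = sum(w * m for w, m in zip(weights, means)) / sum(weights)
    shrink = [var_u / (var_u + var_e / len(g)) for g in groups]
    return mu_hat, [s * (m - mu_hat) for s, m in zip(shrink, means)]


def anova_variance_estimates(groups):
    """Method-of-moments estimates (var_u, var_e) for a balanced layout."""
    a = len(groups)
    n = len(groups[0])
    means = [statistics.fmean(g) for g in groups]
    grand = statistics.fmean(means)
    ms_between = n * sum((m - grand) ** 2 for m in means) / (a - 1)
    ms_within = sum(
        (y - m) ** 2 for g, m in zip(groups, means) for y in g
    ) / (a * (n - 1))
    return max((ms_between - ms_within) / n, 0.0), ms_within


# Hypothetical records for four groups (for example, sires) with three records each.
groups = [
    [10.2, 11.1, 9.8],
    [12.5, 13.0, 12.1],
    [8.9, 9.5, 9.1],
    [11.0, 10.4, 11.3],
]

# EBLUP: plug estimated variance components into the BLUP formula.
var_u_hat, var_e_hat = anova_variance_estimates(groups)
mu_hat, u_eblup = one_way_blup(groups, var_u_hat, var_e_hat)
```

Calling `one_way_blup` with the true variance components (when they are known, as in a simulation) gives BLUP; the gap between the two sets of predictions is what a study of this kind evaluates.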

9.
Simulated data were used to determine the properties of multivariate prediction of breeding values for categorical and continuous traits using phenotypic, molecular genetic and pedigree information by mixed linear-threshold animal models via Gibbs sampling. Simulation parameters were chosen such that the data resembled situations encountered in Warmblood horse populations. Genetic evaluation was performed in the context of the radiographic findings in the equine limbs. The simulated pedigree comprised seven generations and 40 000 animals per generation. The simulated data included additive genetic values, residuals and fixed effects for one continuous trait and liabilities of four binary traits. For one of the binary traits, quantitative trait locus (QTL) effects and genetic markers were simulated, with three different scenarios with respect to recombination rate (r) between genetic markers and QTL and polymorphism information content (PIC) of genetic markers being studied: r = 0.00 and PIC = 0.90 (r0p9), r = 0.01 and PIC = 0.90 (r1p9), and r = 0.00 and PIC = 0.70 (r0p7). For each scenario, 10 replicates were sampled from the simulated horse population, and six different data sets were generated per replicate. Data sets differed in number and distribution of animals with trait records and the availability of genetic marker information. Breeding values were predicted via Gibbs sampling using a Bayesian mixed linear-threshold animal model with residual covariances fixed to zero and a proper prior for the genetic covariance matrix. Relative breeding values were used to investigate expected response to multi- and single-trait selection. In the sires with 10 or more offspring with trait information, correlations between true and predicted breeding values ranged between 0.89 and 0.94 for the continuous traits and between 0.39 and 0.77 for the binary traits. 
Proportions of successful identification of sires of average, favourable and unfavourable genetic value were 81% to 86% for the continuous trait and 57% to 74% for the binary traits in these sires. Expected decrease of prevalence of the QTL trait was 3% to 12% after multi-trait selection for all binary traits and 9% to 17% after single-trait selection for the QTL trait. The combined use of phenotype and genotype data was superior to the use of phenotype data alone. It was concluded that information on phenotypes and highly informative genetic markers should be used for prediction of breeding values in mixed linear-threshold animal models via Gibbs sampling to achieve maximum reduction in prevalences of binary traits.

10.
Schoen DJ, Clegg MT. Genetics, 1986, 112(4): 927-945
Estimation of mating system parameters in plant populations typically employs family-structured samples of progeny genotypes. These estimation models postulate a mixture of self-fertilization and random outcrossing. One assumption of such models concerns the distribution of pollen genotypes among eggs within single maternal families. Previous applications of the mixed mating model to mating system estimation have assumed that pollen genotypes are sampled randomly from the total population in forming outcrossed progeny within families. In contrast, the one-pollen parent model assumes that outcrossed progeny within a family share a single-pollen parent genotype. Monte Carlo simulations of family-structured sampling were carried out to examine the consequences of violations of the different assumptions of the two models regarding the distribution of pollen genotypes among eggs. When these assumptions are violated, estimates of mating system parameters may be significantly different from their true values and may exhibit distributions which depart from normality. Monte Carlo methods were also used to examine the utility of the bootstrap resampling algorithm for estimating the variances of mating system parameters. The bootstrap method gives variance estimates that approximate empirically determined values. When applied to data from two plant populations which differ in pollen genotype distributions within families, the two estimation procedures exhibit the same behavior as that seen with the simulated data.
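
Bootstrap variance estimation for family-structured data can be sketched as follows. The sketch resamples whole families with replacement, which is an assumption about the resampling unit, and it replaces the maximum-likelihood mixed mating estimator with a simple pooled proportion so that the example stays self-contained. The family counts are hypothetical.

```python
import random
import statistics


def pooled_outcrossing_rate(families):
    """Stand-in estimator: pooled proportion of outcrossed progeny."""
    outcrossed = sum(k for k, _ in families)
    total = sum(n for _, n in families)
    return outcrossed / total


def bootstrap_variance(families, estimator, n_boot, rng):
    """Variance of the estimator over bootstrap resamples of whole families."""
    n = len(families)
    replicates = []
    for _ in range(n_boot):
        resample = [families[rng.randrange(n)] for _ in range(n)]
        replicates.append(estimator(resample))
    return statistics.variance(replicates)


# Hypothetical data: (outcrossed progeny, progeny scored) per maternal family.
families = [(7, 10), (9, 10), (6, 10), (8, 10), (10, 10), (5, 10), (7, 10), (8, 10)]

rng = random.Random(42)
t_hat = pooled_outcrossing_rate(families)
t_var = bootstrap_variance(families, pooled_outcrossing_rate, 2000, rng)
```

Resampling families rather than individual progeny keeps the within-family dependence intact, which matters when progeny in a family share pollen parents.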

11.
Monte Carlo importance sampling for evaluating numerical integration is discussed. We consider a parametric family of sampling distributions and propose the use of the sampling distribution estimated by maximum likelihood. The proposed method of importance sampling using the estimated sampling distribution is shown to improve the asymptotic variance of the ordinary method using the true sampling distribution. The argument is closely related to the discussion of the paradox in Henmi & Eguchi (2004). We focus on a condition under which the estimated integration value obtained by the proposed method has asymptotic zero variance.
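
A small numerical experiment can illustrate the claim. The sketch below is a toy setup chosen for this note, not the paper's: the integrand, the N(mu, 1) sampling family, and the true value mu = 0 are all assumptions. The ordinary estimator weights by the true sampling density, while the proposed variant weights by the density evaluated at the maximum-likelihood estimate of mu computed from the same draws.

```python
import math
import random
import statistics

SQRT_2PI = math.sqrt(2.0 * math.pi)


def integrand(x):
    """Hypothetical integrand; its integral over the real line is sqrt(2*pi)."""
    return math.exp(-0.5 * (x - 0.5) ** 2)


def normal_pdf(x, mu):
    """Density of N(mu, 1), the assumed parametric sampling family."""
    return math.exp(-0.5 * (x - mu) ** 2) / SQRT_2PI


def importance_estimates(n, rng):
    """Return (ordinary, estimated-distribution) importance sampling estimates."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]  # draws from the true N(0, 1)
    mu_hat = statistics.fmean(xs)                 # ML estimate of mu
    ordinary = statistics.fmean([integrand(x) / normal_pdf(x, 0.0) for x in xs])
    estimated = statistics.fmean([integrand(x) / normal_pdf(x, mu_hat) for x in xs])
    return ordinary, estimated


# Repeat the experiment to compare the spread of the two estimators.
rng = random.Random(7)
results = [importance_estimates(200, rng) for _ in range(300)]
var_ordinary = statistics.variance([r[0] for r in results])
var_estimated = statistics.variance([r[1] for r in results])
```

In this toy case a first-order expansion suggests the estimated-distribution variant should have markedly smaller variance, because plugging in the fitted mu removes the part of the weight that is correlated with the score of the sampling family.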

12.
Genomic selection uses genome-wide dense SNP marker genotyping for the prediction of genetic values, and consists of two steps: (1) estimation of SNP effects, and (2) prediction of genetic value based on SNP genotypes and estimates of their effects. For the former step, BayesB-type estimators have been proposed, which assume a priori that many markers have no effect, and that some have an effect drawn from a gamma or exponential distribution, i.e. a fat-tailed distribution. Whilst such estimators have been developed using Markov chain Monte Carlo (MCMC), here we derive a much faster non-MCMC based estimator by analytically performing the required integrations. The accuracy of the genome-wide breeding value estimates was 0.011 (s.e. 0.005) lower than that of the MCMC based BayesB predictor, which may be because the integrations were performed one-by-one instead of for all SNPs simultaneously. The bias of the new method was opposite to that of the MCMC based BayesB, in that the new method underestimated the breeding values of the best selection candidates, whereas MCMC-BayesB overestimated their breeding values. The new method was computationally several orders of magnitude faster than MCMC based BayesB, which will mainly be advantageous in computer simulations of entire breeding schemes, in cross-validation testing, and in practical schemes with frequent re-estimation of breeding values.

13.
Knoke JD. Biometrics, 1991, 47(2): 523-533
Change from baseline to a follow-up examination can be compared among two or more randomly assigned treatment groups by using analysis of variance on the change scores. However, a generally more sensitive (powerful) test can be performed using analysis of covariance (ANCOVA) on the follow-up data with the baseline data as a covariate. This approach is not without potential problems, though. The assumption of ordinary ANCOVA of normally distributed errors is speculative for many variables employed in biomedical research. Furthermore, the baseline values are inevitably random variables and often are measured with error. This report investigates, in this situation, the validity and relative power of the ordinary ANCOVA test and two asymptotically distribution-free alternative tests, one based on the rank transformation and the other based on the normal scores transformation. The procedures are illustrated with data from a clinical trial. Normal and several nonnormal distributions, as well as varying degrees of variable error, are studied by Monte Carlo methods. The normal scores test is generally recommended for statistical practice.

14.
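Estimating the parameters of terrestrial ecosystem models from observational data helps to improve the models' simulation and prediction ability and to reduce simulation uncertainty. In existing parameter estimation studies, the random error of net ecosystem carbon exchange (NEE) data measured by the eddy covariance technique is usually assumed to follow a zero-mean normal distribution. However, recent studies have shown that the random error of NEE data is better described by a double exponential distribution. To examine how the choice of NEE observation error distribution affects parameter estimation and simulated carbon fluxes in a process-based terrestrial ecosystem model, we took the temperate broad-leaved Korean pine forest of Changbai Mountain as the study area. Using a Markov chain Monte Carlo method and NEE data measured in 2003-2005, we estimated the sensitive parameters of the process-based terrestrial ecosystem model CEVSA2, and compared the parameter estimates and simulated carbon fluxes obtained under the two error distributions (normal and double exponential). The annual totals of gross primary productivity and ecosystem respiration simulated under normal observation error were 61-86 g C m⁻² a⁻¹ and 107-116 g C m⁻² a⁻¹ higher, respectively, than those simulated under double exponential observation error. As a result, the annual total NEE simulated by the former was 29-47 g C m⁻² a⁻¹ lower than that of the latter, with marked underestimation during the peak growing season in particular. In parameter estimation studies, the distribution of observation error and the corresponding choice of objective function cannot be ignored; an inappropriate setting may strongly affect both the parameter estimates and the simulation results.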

15.
A method is proposed to infer genetic parameters within a cohort, using data from all individuals in an experiment. An application is the study of changes in additive genetic variance over generations, employing data from all generations. Inferences about the genetic variance in a given generation are based on its marginal posterior distribution, estimated via Markov chain Monte Carlo methods. As defined, the additive genetic variance within the group is directly related to the amount of selection response to be expected if parents are chosen within the group. Results from a simulated selection experiment are used to illustrate properties of the method. Four sets of data are analysed: directional selection with and without environmental trend, and random selection, with and without environmental trend. In all cases, posterior credibility intervals of size 95% assign relatively high density to values of the additive genetic variance and heritability in the neighbourhood of the true values. Properties and generalizations of the method are discussed.

16.
Bayesian adaptive Markov chain Monte Carlo estimation of genetic parameters   Total citations: 2, self-citations: 0, citations by others: 2
Accurate and fast estimation of genetic parameters that underlie quantitative traits using mixed linear models with additive and dominance effects is of great importance in both natural and breeding populations. Here, we propose a new fast adaptive Markov chain Monte Carlo (MCMC) sampling algorithm for the estimation of genetic parameters in the linear mixed model with several random effects. In the learning phase of our algorithm, we use the hybrid Gibbs sampler to learn the covariance structure of the variance components. In the second phase of the algorithm, we use this covariance structure to formulate an effective proposal distribution for a Metropolis-Hastings algorithm, which uses a likelihood function in which the random effects have been integrated out. Compared with the hybrid Gibbs sampler, the new algorithm had better mixing properties and was approximately twice as fast to run. Our new algorithm was able to detect different modes in the posterior distribution. In addition, the posterior mode estimates from the adaptive MCMC method were close to the REML (residual maximum likelihood) estimates. Moreover, our exponential prior for inverse variance components was vague and enabled the estimated mode of the posterior variance to be practically zero, which was in agreement with the support from the likelihood (in the case of no dominance). The method performance is illustrated using simulated data sets with replicates and field data in barley.

17.
This study considers the effects of sample size on estimates of three parasitological indices (prevalence, mean abundance and mean intensity) in four different host–parasite systems, each showing a different pattern of infection. Monte Carlo simulation procedures were used in order to obtain an estimation of the parasitological indices, as well as their variance and bias, based on samples of different size. Although results showed that mean values of all indices were similar irrespective of sample size, estimates of prevalence were not significantly affected by sample size whereas mean abundance and mean intensity were affected in at least one sample. Underestimation of values was more perceptible in small (<40) sample sizes. Distribution of the estimated values revealed a different arrangement according to the host–parasite system and to the parasitological parameter. Monte Carlo simulation procedures are, therefore, suggested to be included in studies concerning estimation of parasitological parameters.
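
The kind of Monte Carlo procedure described here can be sketched by repeatedly subsampling a host population and recomputing the three indices. The host population below is hypothetical (an aggregated distribution of parasite counts invented for the example), as are the sample sizes and the number of replicates.

```python
import random
import statistics


def infection_indices(sample):
    """Return (prevalence, mean abundance, mean intensity) for parasite counts."""
    infected = [c for c in sample if c > 0]
    prevalence = len(infected) / len(sample)
    mean_abundance = statistics.fmean(sample)
    # Mean intensity is undefined when no sampled host is infected.
    mean_intensity = statistics.fmean(infected) if infected else None
    return prevalence, mean_abundance, mean_intensity


def simulate(population, sample_size, n_reps, rng):
    """Mean and variance of each index over repeated random subsamples."""
    reps = [
        infection_indices(rng.sample(population, sample_size)) for _ in range(n_reps)
    ]
    names = ("prevalence", "mean_abundance", "mean_intensity")
    summary = {}
    for name, column in zip(names, zip(*reps)):
        values = [v for v in column if v is not None]
        summary[name] = (statistics.fmean(values), statistics.variance(values))
    return summary


# Hypothetical aggregated host population: parasite count per host.
population = [0] * 600 + [1] * 150 + [2] * 100 + [5] * 80 + [12] * 50 + [40] * 20

rng = random.Random(3)
results = {size: simulate(population, size, 1000, rng) for size in (10, 40, 80)}
true_values = infection_indices(population)
```

Comparing each entry of `results` with `true_values` gives the bias at that sample size, and the stored variances show how quickly the spread of mean abundance and mean intensity shrinks as more hosts are examined.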

18.
Estimation of quantitative genetic parameters conventionally requires known pedigree structure. However, several methods have recently been developed to circumvent this requirement by inferring relationship structure from molecular marker data. Here, two such marker-assisted methodologies were used and compared in an aquaculture population of rainbow trout (Oncorhynchus mykiss). First, a regression-based model employing estimates of pairwise relatedness was applied; second, a Markov Chain Monte Carlo (MCMC) procedure was employed to reconstruct full-sibships and hence an explicit pedigree. While both methods were effective in detecting significant components of genetic variance and covariance for size and spawning time traits, the regression model resulted in estimates that were quantitatively unreliable, having both significant bias and low precision. This result can be largely attributed to poor performance of the pairwise relatedness estimator. In contrast, genetic parameters estimated from the reconstructed pedigree showed close agreement with ideal values obtained from the true pedigree. Although not significantly biased, parameters based on the reconstructed pedigree were underestimated relative to ideal values. This was due to the complex structure of the true pedigree, in which high numbers of half-sibling relationships resulted in inaccurate partitioning of full-sibships, and to additional unrecognized relatedness between families.

19.

Background

Genomic best linear unbiased prediction (GBLUP) is a statistical method used to predict breeding values using single nucleotide polymorphisms for selection in animal and plant breeding. Genetic effects are often modeled as additively acting marker allele effects. However, the actual mode of biological action can differ from this assumption. Many livestock traits exhibit genomic imprinting, which may substantially contribute to the total genetic variation of quantitative traits. Here, we present two statistical models of GBLUP including imprinting effects (GBLUP-I) on the basis of genotypic values (GBLUP-I1) and gametic values (GBLUP-I2). The performance of these models for the estimation of variance components and prediction of genetic values across a range of genetic variations was evaluated in simulations.

Results

Estimates of total genetic variances and residual variances with GBLUP-I1 and GBLUP-I2 were close to the true values and the regression coefficients of total genetic values on their estimates were close to 1. Accuracies of estimated total genetic values in both GBLUP-I methods increased with increasing degree of imprinting and broad-sense heritability. When the imprinting variances were equal to 1.4% to 6.0% of the phenotypic variances, the accuracies of estimated total genetic values with GBLUP-I1 exceeded those with GBLUP by 1.4% to 7.8%. In comparison with GBLUP-I1, the superiority of GBLUP-I2 over GBLUP depended strongly on degree of imprinting and difference in genetic values between paternal and maternal alleles. When paternal and maternal alleles were predicted (phasing accuracy was equal to 0.979), accuracies of the estimated total genetic values in GBLUP-I1 and GBLUP-I2 were 1.7% and 1.2% lower than when paternal and maternal alleles were known.

Conclusions

This simulation study shows that GBLUP-I1 and GBLUP-I2 can accurately estimate total genetic variance and perform well for the prediction of total genetic values. GBLUP-I1 is preferred for genomic evaluation, while GBLUP-I2 is preferred when the imprinting effects are large, and the genetic effects differ substantially between sexes.

20.
We introduce a new method, moment reconstruction, of correcting for measurement error in covariates in regression models. The central idea is similar to regression calibration in that the values of the covariates that are measured with error are replaced by "adjusted" values. In regression calibration the adjusted value is the expectation of the true value conditional on the measured value. In moment reconstruction the adjusted value is the variance-preserving empirical Bayes estimate of the true value conditional on the outcome variable. The adjusted values thereby have the same first two moments and the same covariance with the outcome variable as the unobserved "true" covariate values. We show that moment reconstruction is equivalent to regression calibration in the case of linear regression, but leads to different results for logistic regression. For case-control studies with logistic regression and covariates that are normally distributed within cases and controls, we show that the resulting estimates of the regression coefficients are consistent. In simulations we demonstrate that for logistic regression, moment reconstruction carries less bias than regression calibration, and for case-control studies is superior in mean-square error to the standard regression calibration approach. Finally, we give an example of the use of moment reconstruction in linear discriminant analysis and a nonstandard problem where we wish to adjust a classification tree for measurement error in the explanatory variables.
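
The adjustment described in this abstract can be sketched for the simplest case. The sketch assumes a single covariate with classical additive error W = X + U, a known error variance, and a binary outcome Y, and it is a reading of the abstract rather than the authors' code. Within each outcome group the measured values are shrunk toward the group mean by the factor sqrt(Var(X|Y) / Var(W|Y)), so that the adjusted values keep the group mean and have the variance that the true covariate would have. All function and variable names, and the simulated data, are hypothetical.

```python
import math
import random
import statistics


def moment_reconstruct(w, y, error_variance):
    """Adjust error-prone values w so their first two moments given y match X's."""
    adjusted = [0.0] * len(w)
    for level in set(y):
        idx = [i for i, yi in enumerate(y) if yi == level]
        w_group = [w[i] for i in idx]
        mean_w = statistics.fmean(w_group)  # estimates E[X | Y] under classical error
        var_w = statistics.variance(w_group)
        var_x = max(var_w - error_variance, 0.0)  # Var(X | Y) = Var(W | Y) - Var(U)
        shrink = math.sqrt(var_x / var_w)
        for i in idx:
            adjusted[i] = mean_w * (1.0 - shrink) + w[i] * shrink
    return adjusted


# Hypothetical case-control style data: X is normal within each outcome group.
rng = random.Random(11)
error_variance = 0.5
y = [0] * 500 + [1] * 500
x = [rng.gauss(float(yi), 1.0) for yi in y]
w = [xi + rng.gauss(0.0, math.sqrt(error_variance)) for xi in x]

x_mr = moment_reconstruct(w, y, error_variance)
```

The adjusted values `x_mr` would then be used in place of `w` in the outcome model; unlike regression calibration, the adjustment conditions on the outcome rather than on the measured covariate alone.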
