Similar literature
20 similar documents found
1.
On estimating the heterozygosity and polymorphism information content value   Cited by: 1 (self-citations: 0, citations by others: 1)
The polymorphism information content (PIC) value is commonly used in genetics as a measure of polymorphism for a marker locus used in linkage analysis. In this communication we derive the uniformly minimum variance unbiased estimator of PIC along with its exact variance. We also calculate the exact variance of the maximum likelihood estimator of PIC, which is asymptotically unbiased. To find this variance, we derive a recursive formula for the moments of any polynomial in a set of multinomially distributed variables.
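For orientation, the two quantities named in this abstract can be computed from known allele frequencies as below. This is a minimal sketch that assumes the usual definitions, H = 1 - Σ p_i² and PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². It evaluates the population values only; it is not the unbiased estimator derived in the paper, and the function names are illustrative.

```python
def heterozygosity(freqs):
    """Expected heterozygosity H = 1 - sum_i p_i^2 for allele frequencies p_i."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum_i p_i^2 - sum_{i<j} 2 * p_i^2 * p_j^2.

    Plug-in value for *known* frequencies (assumption: the usual
    linkage-analysis definition); this is not the UMVUE of the abstract.
    """
    cross = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            cross += 2.0 * freqs[i] ** 2 * freqs[j] ** 2
    return heterozygosity(freqs) - cross


if __name__ == "__main__":
    # Two equally frequent alleles: H = 0.5 and PIC = 0.375.
    print(heterozygosity([0.5, 0.5]), pic([0.5, 0.5]))
```

PIC is never larger than H, because the subtracted cross term is non-negative.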

2.
Lou XY, Yang MC. Genetica, 2006, 128(1-3):471-484
A genetic model is developed with additive and dominance effects of a single gene and polygenes as well as general and specific reciprocal effects for the progeny from a diallel mating design. The methods of ANOVA, minimum norm quadratic unbiased estimation (MINQUE), restricted maximum likelihood estimation (REML), and maximum likelihood estimation (ML) are suggested for estimating variance components, and the methods of generalized least squares (GLS) and ordinary least squares (OLS) for fixed effects, while best linear unbiased prediction, linear unbiased prediction (LUP), and adjusted unbiased prediction are suggested for analyzing random effects. Monte Carlo simulations were conducted to evaluate the unbiasedness and efficiency of the statistical methods for two diallel designs with commonly used sample sizes, 6 and 8 parents, without and with missing crosses, respectively. Simulation results show that GLS and OLS are almost equally efficient for estimation of fixed effects, while MINQUE(1) and REML are better estimators of the variance components and LUP is the most practical method for prediction of random effects. Data from a Drosophila melanogaster experiment (Gilbert 1985a, Theor Appl Genet 69:625–629) were used as a working example to demonstrate the statistical analysis. The new methodology is also applicable to screening candidate gene(s) and to other mating designs with multiple parents, such as nested (NC Design I) and factorial (NC Design II) designs. Moreover, this methodology can serve as a guide for developing new methods for detecting indiscernible major genes and mapping quantitative trait loci based on mixture distribution theory. The computer program for the methods suggested in this article is freely available from the authors.

3.
Diallel analysis for sex-linked and maternal effects   Cited by: 40 (self-citations: 0, citations by others: 40)
Genetic models including sex-linked and maternal effects as well as autosomal gene effects are described. Monte Carlo simulations were conducted to compare the efficiencies of estimation by minimum norm quadratic unbiased estimation (MINQUE) and restricted maximum likelihood (REML) methods. MINQUE(1), which has 1 for all prior values, has a similar efficiency to MINQUE(), which requires prior estimates of parameter values. MINQUE(1) has the advantage over REML of unbiased estimation and convenient computation. An adjusted unbiased prediction (AUP) method is developed for predicting random genetic effects. AUP is desirable for its easy computation and for the unbiasedness of both the mean and the variance of its predictors. The jackknife procedure is appropriate for estimating the sampling variances of estimated variances (or covariances) and of predicted genetic effects. A t-test based on jackknife variances is applicable for detecting significance of variation. Worked examples from mice and silkworm data are given to demonstrate variance and covariance estimation and genetic effect prediction.
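The jackknife step mentioned in this abstract can be sketched generically. The code below is a plain delete-one jackknife over a list of observations with an arbitrary statistic, followed by a t-statistic against zero on n - 1 degrees of freedom. That is an assumption about the general technique; the resampling unit and the statistic used in the paper (variance components from a mixed model) are not reproduced here, and the names are illustrative.

```python
import math


def jackknife(data, statistic):
    """Delete-one jackknife for an arbitrary statistic.

    Returns (bias-corrected estimate, standard error, t-statistic for a
    test against zero). `data` is a list; `statistic` maps a list to a float.
    """
    n = len(data)
    full = statistic(data)
    # Recompute the statistic with each observation left out in turn.
    leave_one_out = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    mean_loo = sum(leave_one_out) / n
    estimate = n * full - (n - 1) * mean_loo
    variance = (n - 1) / n * sum((t - mean_loo) ** 2 for t in leave_one_out)
    se = math.sqrt(variance)
    t_stat = estimate / se if se > 0 else float("inf")
    return estimate, se, t_stat


if __name__ == "__main__":
    mean = lambda xs: sum(xs) / len(xs)
    print(jackknife([1.0, 2.0, 3.0, 4.0, 5.0], mean))
```

For the sample mean, the jackknife variance reduces to the familiar s²/n, which gives a quick sanity check of the implementation.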

4.
When the underlying disease is rare, we may wish to apply inverse sampling to control the coefficient of variation of the sample proportion of cases. In this paper, we derive the uniformly minimum variance unbiased estimator (UMVUE) of relative risk and its variance in closed form under inverse sampling. On the basis of a Monte Carlo simulation, we demonstrate that using the UMVUE of relative risk can substantially reduce the mean squared error relative to the maximum likelihood estimator, especially when the number of index cases in both comparison samples is small. We also include a program that, for a given fixed total cost, finds the optimal allocation of the number of index cases that minimizes the variance of the UMVUE.
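A hedged sketch of the point estimate follows. It assumes independent inverse (negative binomial) sampling in each group, where x is the predetermined number of cases and n is the total number of subjects sampled to reach it. Under that assumption (x - 1)/(n - 1) is unbiased for a proportion p (for x ≥ 2) and n/x is unbiased for 1/p, so their product across two independent groups is unbiased for p1/p0. The parametrization and the function names are assumptions and may differ from the paper's.

```python
def umvue_relative_risk(x1, n1, x0, n0):
    """Unbiased estimate of RR = p1 / p0 under independent inverse sampling.

    x1, x0: predetermined numbers of cases in the two groups.
    n1, n0: total subjects sampled until those cases were obtained.
    (Hypothetical interface; see the stated assumptions.)
    """
    if x1 < 2 or x0 < 1:
        raise ValueError("need x1 >= 2 and x0 >= 1")
    p1_hat = (x1 - 1) / (n1 - 1)   # unbiased for p1
    inv_p0_hat = n0 / x0           # unbiased for 1 / p0
    return p1_hat * inv_p0_hat


def mle_relative_risk(x1, n1, x0, n0):
    """Maximum likelihood estimate: ratio of the sample proportions."""
    return (x1 / n1) / (x0 / n0)


if __name__ == "__main__":
    print(umvue_relative_risk(5, 101, 5, 201), mle_relative_risk(5, 101, 5, 201))
```

With few index cases the two estimates can differ noticeably, which is the regime the abstract highlights.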

5.
Mixed model approaches for estimating genetic variances and covariances   Cited by: 62 (self-citations: 4, citations by others: 58)
The limitations of analysis of variance (ANOVA) methods in estimating genetic variances are discussed. Among the three methods for mixed linear models (maximum likelihood, ML; restricted maximum likelihood, REML; and minimum norm quadratic unbiased estimation, MINQUE), the MINQUE method is presented with formulae for estimating variance components and covariance components and for predicting genetic effects. Several genetic models that cannot be appropriately analyzed by ANOVA methods are introduced in the form of mixed linear models. Genetic models with independent random effects can be analyzed by the MINQUE(1) method, which is a MINQUE method with all prior values set to 1. The MINQUE(1) method gives unbiased estimates of variance components and covariance components, and linear unbiased prediction (LUP) of genetic effects. There are more complicated genetic models for plant seeds that involve correlated random effects. The MINQUE(0/1) method, which is a MINQUE method with all prior covariances set to 0 and all prior variances set to 1, is suitable for estimating variance and covariance components in these models. Mixed model approaches have an advantage over ANOVA methods in their capacity to analyze unbalanced data and complicated models. Some problems concerning estimation and hypothesis testing by the MINQUE method are discussed.

6.
When the sample size is not large or when the underlying disease is rare, one may employ inverse sampling to assure collection of an appropriate number of cases and to control the relative error of estimation; under inverse sampling, one continues sampling subjects until exactly the desired number of cases is obtained. This paper focuses on interval estimation of the simple difference between two proportions under independent inverse sampling. It develops three asymptotic interval estimators on the basis of the maximum likelihood estimator (MLE), the uniformly minimum variance unbiased estimator (UMVUE), and the asymptotic likelihood ratio test (ALRT). To compare the performance of these three estimators, the paper calculates the coverage probability and the expected length of the resulting confidence intervals on the basis of the exact distribution. It finds that when the underlying proportions of cases in both comparison populations are small or moderate (≤0.20), all three asymptotic interval estimators perform reasonably well even when the pre-determined number of cases is as small as 5. When the pre-determined number of cases is moderate or large (≥50), all three estimators are essentially equivalent in all the situations considered. Because the two interval estimators derived from the MLE and the UMVUE do not involve the numerical iterative procedure needed in the ALRT, for simplicity we may use these two estimators without losing efficiency.

7.
Molecular marker data collected from natural populations allow information on genetic relationships to be established without reference to an exact pedigree. Numerous methods have been developed to exploit the marker data. These fall into two main categories: method of moment estimators and likelihood estimators. Method of moment estimators are essentially unbiased, but utilise weighting schemes that are only optimal if the analysed pair is unrelated. Thus, they differ in their efficiency at estimating parameters for different relationship categories. Likelihood estimators show smaller mean squared errors but are much more biased. Both types of estimator have been used in variance component analysis to estimate heritability. All marker-based heritability estimators require that adequate levels of the true relationship be present in the population of interest and that adequate amounts of informative marker data are available. I review the different approaches to relationship estimation, with particular attention to optimizing the use of this relationship information in subsequent variance component estimation.

8.
The fate of scientific hypotheses often relies on the ability of a computational model to explain the data, quantified in modern statistical approaches by the likelihood function. The log-likelihood is the key element for parameter estimation and model evaluation. However, the log-likelihood of complex models in fields such as computational biology and neuroscience is often intractable to compute analytically or numerically. In those cases, researchers can often only estimate the log-likelihood by comparing observed data with synthetic observations generated by model simulations. Standard techniques to approximate the likelihood via simulation either use summary statistics of the data or are at risk of producing substantial biases in the estimate. Here, we explore another method, inverse binomial sampling (IBS), which can estimate the log-likelihood of an entire data set efficiently and without bias. For each observation, IBS draws samples from the simulator model until one matches the observation. The log-likelihood estimate is then a function of the number of samples drawn. The variance of this estimator is uniformly bounded, achieves the minimum variance for an unbiased estimator, and we can compute calibrated estimates of the variance. We provide theoretical arguments in favor of IBS and an empirical assessment of the method for maximum-likelihood estimation with simulation-based models. As case studies, we take three model-fitting problems of increasing complexity from computational and cognitive neuroscience. In all problems, IBS generally produces lower error in the estimated parameters and maximum log-likelihood values than alternative sampling methods with the same average number of samples. Our results demonstrate the potential of IBS as a practical, robust, and easy to implement method for log-likelihood evaluation when exact techniques are not available.
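The sampling loop described in this abstract can be sketched as follows. The sketch assumes the usual form of the IBS estimator: if the first matching sample is the K-th draw, the estimate of log p is -(1 + 1/2 + ... + 1/(K-1)), which is 0 when the first draw matches. The simulator interface (`simulate(rng)` returning one synthetic observation) is a hypothetical simplification; real models would also condition on stimuli and parameters.

```python
import math
import random


def ibs_loglik_single(simulate, observed, rng):
    """IBS estimate of log p(observed) for one observation.

    Draws from `simulate` until a draw equals `observed`. Each miss on the
    k-th draw contributes -1/k. Note: the loop does not terminate if
    `observed` has probability zero under the simulator.
    """
    estimate = 0.0
    k = 1
    while simulate(rng) != observed:
        estimate -= 1.0 / k
        k += 1
    return estimate


def ibs_loglik(simulate, data, rng):
    """Sum of per-observation IBS estimates over a data set."""
    return sum(ibs_loglik_single(simulate, obs, rng) for obs in data)


if __name__ == "__main__":
    rng = random.Random(0)
    p = 0.3
    coin = lambda r: 1 if r.random() < p else 0  # toy simulator, P(1) = p
    runs = [ibs_loglik_single(coin, 1, rng) for _ in range(20000)]
    # The average of many estimates should be close to log(p).
    print(sum(runs) / len(runs), math.log(p))
```

The toy Bernoulli simulator is chosen only because its true log-likelihood, log(p), is known and can be compared with the average estimate.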

9.
Brownian motions on coalescent structures have biological relevance, either as an approximation of the stepwise mutation model for microsatellites, or as a model of spatial evolution considering the locations of individuals at successive generations. We discuss estimation procedures for the dispersal parameter of a Brownian motion defined on coalescent trees. First, we consider the mean square distance unbiased estimator and compute its variance. In a second approach, we introduce a phylogenetic estimator. Given the UPGMA topology, the likelihood of the parameter is computed using a new dynamic programming method. By a proper correction, an unbiased estimator is derived from the pseudomaximum of the likelihood. The last approach consists of computing the likelihood by a Markov chain Monte Carlo sampling method. In the one-dimensional Brownian motion, this method seems less reliable than pseudomaximum-likelihood.

10.
Studies in genetics and ecology often require estimates of relatedness coefficients based on genetic marker data. Many diploid estimators have been developed using either method‐of‐moments or maximum‐likelihood estimates. However, there are no relatedness estimators for polyploids. The development of a moment estimator for polyploids with polysomic inheritance, which simultaneously incorporates the two‐gene relatedness coefficient and various ‘higher‐order’ coefficients, is described here. The performance of the estimator is compared to other estimators under a variety of conditions. When using a small number of loci, the estimator is biased because of an increase in ill‐conditioned matrices. However, the estimator becomes asymptotically unbiased with large numbers of loci. The ambiguity of polyploid heterozygotes (when balanced heterozygotes cannot be distinguished from unbalanced heterozygotes) is also considered; as with low numbers of loci, genotype ambiguity leads to bias. A software package, PolyRelatedness, implementing this method and supporting a maximum ploidy of 8 is provided.

11.
M C Wu, K R Bailey. Biometrics, 1989, 45(3):939-955
A general linear regression model for the usual least squares estimated rate of change (slope) on censoring time is described as an approximation to account for informative right censoring in estimating and comparing changes of a continuous variable in two groups. Two noniterative estimators for the group slope means, the linear minimum variance unbiased (LMVUB) estimator and the linear minimum mean squared error (LMMSE) estimator, are proposed under this conditional model. In realistic situations, we illustrate that the LMVUB and LMMSE estimators, derived under a simple linear regression model, are quite competitive compared to the pseudo maximum likelihood estimator (PMLE) derived by modeling the censoring probabilities. Generalizations to polynomial response curves and general linear models are also described.

12.
Statistical inference for microarray experiments usually involves the estimation of error variance for each gene. Because the sample size available for each gene is often low, the usual unbiased estimator of the error variance can be unreliable. Shrinkage methods, including empirical Bayes approaches that borrow information across genes to produce more stable estimates, have been developed in recent years. Because the same microarray platform is often used for at least several experiments to study similar biological systems, there is an opportunity to improve variance estimation further by borrowing information not only across genes but also across experiments. We propose a lognormal model for error variances that involves random gene effects and random experiment effects. Based on the model, we develop an empirical Bayes estimator of the error variance for each combination of gene and experiment and call this estimator BAGE because information is Borrowed Across Genes and Experiments. A permutation strategy is used to make inference about the differential expression status of each gene. Simulation studies with data generated from different probability models and real microarray data show that our method outperforms existing approaches.

13.
Computer simulation was used to compare minimum variance quadratic estimation (MIVQUE), minimum norm quadratic unbiased estimation (MINQUE), restricted maximum likelihood (REML), maximum likelihood (ML), and Henderson's Method 3 (HM3) on the basis of variance among estimates, mean square error (MSE), bias and probability of nearness for estimation of both individual variance components and three ratios of variance components. The investigation also compared three procedures for dealing with negative estimates and included the use of both individual observations and plot means as the experimental unit of the analysis. The structure of data simulated (field design, mating designs, genetic architecture and imbalance) represented typical analysis problems in quantitative forest genetics. Results of comparing the estimation techniques demonstrated that: estimates of probability of nearness did not discriminate among techniques; bias was discriminatory among procedures for dealing with negative estimates but not among estimation techniques (except ML); sampling variance among estimates was discriminatory among procedures for dealing with negative estimates, estimation techniques and unit of observation; and MSE provided no additional information to variance of the estimates. HM3 and REML were the closest competitors under these criteria; however, REML demonstrated greater robustness to imbalance. Of the three negative estimate procedures, two are of practical significance and guidelines for their application are presented. Estimates from individual observations were always preferable to those from plot means over the experimental levels of this study. This is Journal Series No. R-03768 of the Institute of Food and Agricultural Sciences.

14.
DeGiorgio M, Jankovic I, Rosenberg NA. Genetics, 2010, 186(4):1367-1387
Gene diversity, a commonly used measure of genetic variation, evaluates the proportion of heterozygous individuals expected at a locus in a population, under the assumption of Hardy-Weinberg equilibrium. When using the standard estimator of gene diversity, the inclusion of related or inbred individuals in a sample produces a downward bias. Here, we extend a recently developed estimator shown to be unbiased in a diploid autosomal sample that includes known related or inbred individuals to the general case of arbitrary ploidy. We derive an exact formula for the variance of the new estimator, H, and present an approximation to facilitate evaluation of the variance when each individual is related to at most one other individual in a sample. When examining samples from the human X chromosome, which represent a mixture of haploid and diploid individuals, we find that H performs favorably compared to the standard estimator, both in theoretical computations of mean squared error and in data analysis. We thus propose that H is a useful tool in characterizing gene diversity in samples of arbitrary ploidy that contain related or inbred individuals.
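For reference, the "standard estimator" that this abstract contrasts against is usually taken to be n/(n - 1) · (1 - Σ p̂_i²), where n is the number of sampled allele copies and p̂_i are the sample allele frequencies. The sketch below implements only that baseline, under the assumption that the sampled alleles are independent (no related or inbred individuals); it is not the new estimator H of the paper, and the function name is illustrative.

```python
from collections import Counter


def gene_diversity(alleles):
    """Standard gene diversity estimate from a list of sampled allele copies.

    Assumes independent allele copies; per the abstract, this baseline is
    biased downward when relatives or inbred individuals are included.
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two allele copies")
    counts = Counter(alleles)
    homozygosity = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - homozygosity)


if __name__ == "__main__":
    # Two diploid individuals, genotypes A/A and B/B, give four allele copies.
    print(gene_diversity(["A", "A", "B", "B"]))
```

The n/(n - 1) factor is the small-sample correction; without it the plug-in value 1 - Σ p̂_i² underestimates diversity even for unrelated individuals.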

15.
Chen DG, Carter EM, Hubert JJ, Kim PT. Biometrics, 1999, 55(4):1038-1043
This article presents a new empirical Bayes estimator (EBE) and a shrinkage estimator for determining the relative potency from several multivariate bioassays by incorporating prior information on the model parameters based on Jeffreys' rules. The EBE can account for any extra variability among the bioassays, and if this extra variability is 0, then the EBE reduces to the maximum likelihood estimator for combinations of multivariate bioassays. The shrinkage estimator turns out to be a compromise of the prior information and the estimator from each multivariate bioassay, with the weights depending on the prior variance.

16.
In this article, we provide a template for the practical implementation of the targeted maximum likelihood estimator for analyzing causal effects of multiple time point interventions, for which the methodology was developed and presented in Part I. In addition, the application of this template is demonstrated in two important estimation problems: estimation of the effect of individualized treatment rules based on marginal structural models for treatment rules, and the effect of a baseline treatment on survival in a randomized clinical trial in which the time till event is subject to right censoring.

17.
Assessment of genetic diversity in a crop germplasm is a vital part of plant breeding. DNA markers such as microsatellite or simple sequence repeat (SSR) markers have been widely used to estimate the genetic diversity in rice. The present study was carried out to decipher the pattern of genetic diversity in terms of both phenotypic and genotypic variability, and to assess the efficiency of random vis-à-vis QTL-linked/gene-based simple sequence repeat markers in diversity estimation. A set of 88 rice accessions that included landraces, farmers' varieties and popular Basmati lines was evaluated for agronomic traits and molecular diversity. The random set of SSR markers included 50 diversity panel markers developed under IRRI’s Generation Challenge Programme (GCP), and the trait-linked/gene-based markers comprised 50 SSR markers reportedly linked to yield and related components. For agronomic traits, significant variability was observed, ranging between the maximum for grains/panicle and the minimum for panicle length. The molecular diversity based grouping indicated that varieties from a common centre were genetically similar, with few exceptions. The trait-linked markers gave an average genetic dissimilarity of 0.45 as against 0.37 for random markers, along with average polymorphism information content values of 0.48 and 0.41, respectively. The correlation between the kinship matrix generated by trait-linked markers and the phenotype based distance matrix (0.29) was higher than that of random markers (0.19). This establishes the robustness of trait-linked markers over random markers in estimating genetic diversity of rice germplasm.

18.
Best linear unbiased allele-frequency estimation in complex pedigrees   Cited by: 4 (self-citations: 0, citations by others: 4)
McPeek MS, Wu X, Ober C. Biometrics, 2004, 60(2):359-367
Many types of genetic analyses depend on estimates of allele frequencies. We consider the problem of allele-frequency estimation based on data from related individuals. The motivation for this work is data collected on the Hutterites, an isolated founder population, so we focus particularly on the case in which the relationships among the sampled individuals are specified by a large, complex pedigree for which maximum likelihood estimation is impractical. For this case, we propose to use the best linear unbiased estimator (BLUE) of allele frequency. We derive this estimator, which is equivalent to the quasi-likelihood estimator for this problem, and we describe an efficient algorithm for computing the estimate and its variance. We show that our estimator has certain desirable small-sample properties in common with the maximum likelihood estimator (MLE) for this problem. We treat both the case when parental origin of each allele is known and when it is unknown. The results are extended to prediction of allele frequency in some set of individuals S based on genotype data collected on a set of individuals R. We compare the mean-squared error of the BLUE, the commonly used naive estimator (sample frequency) and the MLE when the latter is feasible to calculate. The results indicate that although the MLE performs the best of the three, the BLUE is close in performance to the MLE and is substantially easier to calculate, making it particularly useful for large complex pedigrees in which MLE calculation is impractical or infeasible. We apply our method to allele-frequency estimation in a Hutterite data set.

19.
Best (1974) found the variance of the minimum variance unbiased estimator of the Bernoulli parameter under inverse sampling. Noting its intricacy, Mikulski and Smith (1976) found bounds on this variance, and Sathe (1977) developed closer bounds. In this paper, a still closer upper bound on the variance is obtained using Jensen's inequality.

20.
Although F(ST) is widely used as a measure of population structure, it has been criticized recently because of its dependency on within-population diversity. This dependency can lead to difficulties in interpretation and in the comparison of estimates among species or among loci and has led to the development of two replacement statistics, F'(ST) and D. F'(ST) is the normal F(ST) standardized by the maximum value it can obtain, given the observed within-population diversity. D uses a multiplicative partitioning of diversity, based on the effective number of alleles rather than on the expected heterozygosity. In this study, we review the relationships between the three classes of statistics (F(ST), F'(ST) and D), their estimation and their properties. We illustrate the relationships between the statistics using a data set of estimates from 84 species taken from the last 4 years of Molecular Ecology. As with F(ST), unbiased estimators are available for the two new statistics D and F'(ST). Here, we develop a new unbiased F'(ST) estimator based on G(ST), which we call G'(ST). However, F'(ST) can be calculated using any F(ST) estimator for which the maximum value can be obtained. As all three statistics have their advantages and their drawbacks, we recommend continued use of F(ST) in combination with either F'(ST) or D. In most cases, F'(ST) would be the best choice among the latter two as it is most suited for inferences of the influence of demographic processes such as genetic drift and migration on genetic population structure.
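The dependency on within-population diversity described in this abstract can be illustrated numerically. The sketch below uses commonly cited parameter forms, which are assumptions here: Nei's G_ST = (H_T - H_S)/H_T, Hedrick's standardized G'_ST = G_ST (k - 1 + H_S)/((k - 1)(1 - H_S)), and Jost's D = (H_T - H_S)/(1 - H_S) · k/(k - 1), for k equally weighted populations at one locus. These are plug-in values from known frequencies, not the unbiased estimators discussed in the paper, and the function name is illustrative.

```python
def differentiation_stats(pop_freqs):
    """Return (G_ST, G'_ST, D) for one locus.

    pop_freqs: list of k allele-frequency lists, one per population, all of
    the same length. Populations are weighted equally (assumption).
    Undefined (division by zero) when H_T = 0 or H_S = 1.
    """
    k = len(pop_freqs)
    n_alleles = len(pop_freqs[0])
    # H_S: mean within-population expected heterozygosity.
    h_s = sum(1.0 - sum(p * p for p in freqs) for freqs in pop_freqs) / k
    # H_T: expected heterozygosity of the pooled (mean) allele frequencies.
    mean_freqs = [sum(freqs[a] for freqs in pop_freqs) / k for a in range(n_alleles)]
    h_t = 1.0 - sum(p * p for p in mean_freqs)
    g_st = (h_t - h_s) / h_t
    g_st_std = g_st * (k - 1 + h_s) / ((k - 1) * (1.0 - h_s))
    jost_d = (h_t - h_s) / (1.0 - h_s) * k / (k - 1)
    return g_st, g_st_std, jost_d


if __name__ == "__main__":
    # Two populations that share no alleles, each internally diverse.
    print(differentiation_stats([[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]))
```

In the example the two populations share no alleles, yet G_ST is only 1/3 because each population is internally diverse, while the standardized statistic and D both equal 1.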
