Similar literature
20 similar articles found
1.
We consider the question: In a segregation analysis, can knowledge of the family-size distribution (FSD) in the population from which a sample is drawn improve the estimators of genetic parameters? In other words, should one incorporate the population FSD into a segregation analysis if one knows it? If so, then under what circumstances? And how much improvement may result? We examine the variance and bias of the maximum likelihood estimators both asymptotically and in finite samples. We consider Poisson and geometric FSDs, as well as a simple two-valued FSD in which all families in the population have either one or two children. We limit our study to a simple genetic model with truncate selection. We find that if the FSD is completely specified, then the asymptotic variance of the estimator may be reduced by as much as 5%-10%, especially when the FSD is heavily skewed toward small families. Results in small samples are less clear-cut. For some of the simple two-valued FSDs, the variance of the estimator in small samples of one- and two-child families may actually be increased slightly when the FSD is included in the analysis. If one knows only the statistical form of the FSD, but not its parameter, then the estimator is improved only minutely. Our study also underlines the fact that results derived from asymptotic maximum likelihood theory do not necessarily hold in small samples. We conclude that in most practical applications it is not worth incorporating the FSD into a segregation analysis. However, this practice may be justified under special circumstances where the FSD is completely specified, without error, and the population consists overwhelmingly of small families.

2.
Yao YC  Tai JJ 《Biometrics》2000,56(3):795-800
Segregation ratio estimation has long been important in human genetics. A simple truncated binomial model is considered that assumes complete ascertainment and a deterministic genotype-phenotype relationship. A simple but intuitively appealing estimator of the segregation ratio, previously proposed, is shown to have a negative bias. It is also shown that the bias of this estimator can be largely reduced via a randomization device, resulting in a new estimator that has the same large-sample behavior but with a negligible bias (decaying at a geometric rate). Numerical results are given to show the small-sample performance of this new estimator. An extension to incomplete ascertainment is also considered.
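As a rough illustration of the truncated binomial model this abstract describes, the Python sketch below maximizes the likelihood of the segregation ratio p under complete ascertainment, where a sibship of size s with r ≥ 1 affected children has probability C(s, r) p^r (1 − p)^(s − r) / (1 − (1 − p)^s). It is not the randomized estimator proposed by the authors. The function names, the data layout (a list of `(size, affected)` pairs), and the golden-section search are assumptions made purely for illustration.

```python
import math


def truncated_binomial_loglik(p, sibships):
    """Log-likelihood of segregation ratio p.

    sibships: list of (size, affected) pairs, each with affected >= 1
    (sibships with no affected child are never observed).
    """
    total = 0.0
    for s, r in sibships:
        # log of 1 - (1 - p)**s, computed stably for small p
        log_trunc = math.log(-math.expm1(s * math.log1p(-p)))
        total += (math.log(math.comb(s, r))
                  + r * math.log(p)
                  + (s - r) * math.log1p(-p)
                  - log_trunc)
    return total


def mle_segregation_ratio(sibships, tol=1e-10):
    """Maximize the log-likelihood over (0, 1) by golden-section search."""
    lo, hi = 1e-9, 1.0 - 1e-9
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a = hi - inv_phi * (hi - lo)
    b = lo + inv_phi * (hi - lo)
    fa = truncated_binomial_loglik(a, sibships)
    fb = truncated_binomial_loglik(b, sibships)
    while hi - lo > tol:
        if fa < fb:
            # maximum lies in [a, hi]
            lo, a, fa = a, b, fb
            b = lo + inv_phi * (hi - lo)
            fb = truncated_binomial_loglik(b, sibships)
        else:
            # maximum lies in [lo, b]
            hi, b, fb = b, a, fa
            a = hi - inv_phi * (hi - lo)
            fa = truncated_binomial_loglik(a, sibships)
    return 0.5 * (lo + hi)
```

A one-dimensional search is used here because the truncated binomial is an exponential family, so the likelihood has a single maximum in p; any general-purpose optimizer would serve equally well.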

3.
We consider the problem of estimating segregation ratios in families based on ascertainment through affected children, formulate it as an incomplete-data problem and work out the EM algorithm for maximum likelihood estimation of segregation ratios. We treat both the cases of known and unknown ascertainment probability. We also derive expressions for the covariance matrix of the estimators suitable for computing along with the EM algorithm. We illustrate the method with an example, compare the computational effort with that required in using the scoring method and argue that the EM algorithm is simpler.
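To make the EM idea concrete, here is a minimal Python sketch for one special case only: complete ascertainment (ascertainment probability fixed at 1), where the missing data are the sibships with no affected child. It is not the authors' general algorithm, and the function name, data layout, and stopping rule are hypothetical choices for illustration.

```python
def em_segregation_ratio(sibships, p0=0.5, tol=1e-12, max_iter=10_000):
    """EM estimate of the segregation ratio under complete ascertainment.

    sibships: list of (size, affected) pairs with affected >= 1.
    """
    total_children = sum(s for s, _ in sibships)
    total_affected = sum(r for _, r in sibships)
    p = p0
    for _ in range(max_iter):
        # E-step: expected number of (all unaffected) children in the
        # unobserved sibships, given the current value of p.
        unseen_children = 0.0
        for s, _ in sibships:
            q_s = (1.0 - p) ** s          # P(no affected child in size s)
            unseen_children += s * q_s / (1.0 - q_s)
        # M-step: ordinary binomial proportion on the completed data.
        p_new = total_affected / (total_children + unseen_children)
        if abs(p_new - p) < tol:
            return p_new
        p = p_new
    return p
```

Each iteration needs only sums over sibships, which is the kind of simplicity relative to scoring that the abstract argues for; the covariance computation mentioned there is not sketched here.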

4.
The effect of proband designation on segregation analysis
In many family studies, it is often difficult to know exactly how the families were ascertained. Even if known, the circumstances under which the families came to the attention of the study may violate the assumptions of classical ascertainment bias correction. The purpose of this work was to investigate the effect on segregation analysis of violations of the assumptions of the classical ascertainment model. We simulated family data generated under a simple recessive model of inheritance. We then ascertained families under different "scenarios." These scenarios were designed to simulate actual conditions under which families come to the attention of (and then interact with) a clinic or genetic study. We show that how one designates probands, which one must do under the classical ascertainment model, can influence parameter estimation and hypothesis testing. We demonstrate that, in some cases, there may be no "correct" way to designate probands. Further, we show that interactions within the family, the conditions under which the genetic study must function, and even social influences can have a profound effect on segregation analysis. We also propose a method for dealing with the ascertainment problem that is applicable to almost any study situation.

5.
Shwachman-Diamond syndrome is a rare disorder of unknown cause. Reports have indicated the occurrence of affected siblings, but formal segregation analysis has not been performed. In families collected for genetic studies, the mean paternal age and mean difference in parental ages were found to be consistent with the general population. We determined estimates of segregation proportion in a cohort of 84 patients with complete sibship data under the assumption of complete ascertainment, using the Li and Mantel estimator, and of single ascertainment with the Davie modification. A third estimate was also computed with the expectation-maximization (EM) algorithm. All three estimates supported an autosomal recessive mode of inheritance, but complete ascertainment was found to be unlikely. Although there are no overt signs of disease in adult carriers (parents), the use of serum trypsinogen levels to indicate exocrine pancreatic dysfunction was evaluated as a potential measure for heterozygote expression. No consistent differences were found in levels between parents and a normal control population. Although genetic heterogeneity cannot be excluded, our results indicate that simulation and genetic analyses of Shwachman-Diamond syndrome should consider a recessive model of inheritance.
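For orientation, the Li and Mantel estimator named in this abstract is commonly written as (R − J) / (T − J), where R is the total number of affected children, T the total number of children, and J the number of sibships with exactly one affected child. The Python sketch below computes it, alongside the classical proband-removal estimator for single ascertainment, (R − N) / (T − N) with N sibships, which is shown only as a familiar point of comparison and is not claimed to be the Davie modification used in the study. Function names and the `(size, affected)` data layout are assumptions.

```python
def li_mantel(sibships):
    """Li-Mantel estimate of the segregation ratio (complete ascertainment).

    sibships: list of (size, affected) pairs with affected >= 1.
    """
    total_children = sum(s for s, _ in sibships)
    total_affected = sum(r for _, r in sibships)
    # J: sibships that contain exactly one affected child
    singletons = sum(1 for _, r in sibships if r == 1)
    return (total_affected - singletons) / (total_children - singletons)


def proband_removal(sibships):
    """Classical single-ascertainment estimate: drop one proband per sibship."""
    n = len(sibships)
    total_children = sum(s for s, _ in sibships)
    total_affected = sum(r for _, r in sibships)
    return (total_affected - n) / (total_children - n)
```

For a recessive trait with two carrier parents, values near 0.25 would be the expected target; which estimator is appropriate depends on the ascertainment assumption, which is exactly the point the abstract raises.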

6.
The use of methodologies such as RAPD and AFLP for studying genetic variation in natural populations is widespread in the ecology community. Because data generated using these methods exhibit dominance, their statistical treatment is less straightforward. Several estimators have been proposed for estimating population genetic parameters, assuming simple random sampling and the Hardy-Weinberg (HW) law. The merits of these estimators remain unclear because no comparative studies of their theoretical properties have been carried out. Furthermore, ascertainment bias has not been explicitly modelled. Here, we present a comparison of a set of candidate estimators of null allele frequency (q), locus-specific heterozygosity (h) and average heterozygosity in terms of their bias, standard error, and root mean square error (RMSE). For estimating q and h, we show that none of the estimators considered has the least RMSE over the parameter space. Our proposed zero-correction procedure, however, generally leads to estimators with improved RMSE. Assuming a beta model for the distribution of null homozygote proportions, we show how correction for ascertainment bias can be carried out using a linear transform of the sample average of h and the truncated beta-binomial likelihood. Simulation results indicate that the maximum likelihood and empirical Bayes estimators of average heterozygosity have negligible bias and similar RMSE. Ascertainment bias in estimators of average heterozygosity is most pronounced when the beta distribution is J-shaped and negligible when the latter is inverse J-shaped. The validity of the current findings depends importantly on the HW assumption, a point that we illustrate using data from two published studies.

7.
Cannings and Thompson suggested conditioning on the phenotypes of the probands to correct for ascertainment in the analysis of pedigree data. The method assumes single ascertainment and can be expected to yield asymptotically biased parameter estimates except in this specific case. However, because the method is easy to apply, we investigated the degree of bias in the more typical situation of multiple ascertainment, in the hope that the bias might be small and that the method could be applied more generally. To explore the utility of conditioning on probands to correct for multiple ascertainment, we calculated the asymptotic value of the segregation ratio for two versions of the simple Mendelian segregation model on sibship data. For both versions, we found that this asymptotic value decreased approximately linearly as the ascertainment probability increased. When ascertainment was complete, the segregation-ratio estimates were zero, not just asymptotically but for finite sample size as well. In some cases, conditioning on probands actually resulted in greater parameter bias than no ascertainment correction at all. These results hold for a variety of sibship-size distributions, several modes of inheritance, and a wide range of population prevalences of affected individuals.

8.
Molecular marker data provide a means of circumventing the problem of not knowing the population structure of a natural population, as observed similarities between a pair's genotypes provide information on their genetic relationship. Numerous method-of-moment (MOM) estimators have been developed for estimating relationship coefficients using this information. Here, I present a simplified form of Wang's 2002 relationship estimator that is not dependent upon a previously required weighting scheme, thus improving the efficiency of the estimator when used with genuinely related pairs. The new estimator is compared against other estimators under a range of conditions, including situations where the parameter estimates are truncated to lie within the legitimate parameter space. The advantages of the new estimator are most notable for the two-gene coefficient of relatedness. Truncating the MOM estimators results in parameter estimates whose properties are similar to those of maximum likelihood estimates: they generally have lower sampling variances but are biased.

9.
Aspects of parameter estimation in ascertainment sampling schemes.
It has recently been suggested that ascertainment sampling estimation procedures commonly used are not fully efficient in that the number of unobserved families is an unknown parameter that should be estimated (contrary to common practice) along with the genetic parameters for fully efficient estimation. It has also been suggested that the frequency distribution of family size contains unknown parameters that should similarly be estimated with the genetic parameters. These two suggestions are considered in this paper. It is shown by means of an equivalence theorem that in both cases the estimates and their variances obtained by adopting the suggested procedure are identical with those found by ignoring the unobserved families and by ignoring the family-size distribution. This demonstration leads to a formal justification of further procedures, in particular: (1) use of "method-of-moments" estimators, (2) ignoring the ascertainment scheme in some cases when estimating parameters, and (3) forming estimates of parameters when various parts of the data are obtained by different ascertainment schemes.

10.
We consider the estimation of the scaled mutation parameter θ, which is one of the parameters of key interest in population genetics. We provide a general result showing when estimators of θ can be improved using shrinkage when taking the mean squared error as the measure of performance. As a consequence, we show that Watterson's estimator is inadmissible, and propose an alternative shrinkage-based estimator that is easy to calculate and has a smaller mean squared error than Watterson's estimator for all possible parameter values 0 < θ < ∞. This estimator is admissible in the class of all linear estimators. We then derive improved versions for other estimators of θ, including the MLE. We also investigate how an improvement can be obtained both when combining information from several independent loci and when explicitly taking into account recombination. A simulation study provides information about the amount of improvement achieved by our alternative estimators.
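For reference, Watterson's estimator is usually written as θ_W = S / a_n, where S is the number of segregating sites in a sample of n sequences and a_n = 1 + 1/2 + … + 1/(n − 1). The short Python sketch below computes it; the function and argument names are illustrative assumptions, and the shrinkage-based alternative proposed in the abstract is not reproduced because its exact form is not given here.

```python
def watterson_theta(num_segregating_sites, sample_size):
    """Watterson's estimator of the scaled mutation parameter theta.

    num_segregating_sites: S, number of polymorphic sites in the sample.
    sample_size: n, number of sampled sequences (must be at least 2).
    """
    if sample_size < 2:
        raise ValueError("sample_size must be at least 2")
    # a_n is the (n - 1)-th harmonic number
    a_n = sum(1.0 / i for i in range(1, sample_size))
    return num_segregating_sites / a_n
```

A linear shrinkage estimator in the sense of the abstract would multiply this value by a constant below one; the appropriate constant is the subject of the paper and is left unspecified here.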

11.
A resolution of the ascertainment sampling problem. III. Pedigrees.
When nuclear families are sampled by an ascertainment procedure whose properties are not known, biased estimates of genetic parameters will arise if an incorrect specification of the ascertainment procedure is made. Elsewhere we have put forward a resolution of this problem by introducing an ascertainment-assumption-free (AAF) method, for nuclear family data, which gives asymptotically unbiased estimators no matter what the true nature of the ascertainment process. In the present paper we extend this method to cover pedigree data. Problems that arise with pedigrees but not with families (for example, the question of which families in a pedigree are "ascertainable") are also considered. Comparisons of numerical results for pedigrees and nuclear families are also made.

12.
Several different methodologies for parameter estimation under various ascertainment sampling schemes have been proposed in the past. In this article, some of the methodologies that have been proposed for independent sibships under the classical segregation analysis model are synthesized, and the general likelihoods are derived for single, multiple and complete ascertainment. The issue of incorporating the sibship size distribution into the analysis is addressed, and the effect of conditioning the likelihood on the observed sibship sizes is discussed. It is shown that when the number of probands in a sibship is not specified, the corresponding likelihood can be used for a broader class of ascertainment schemes than is subsumed by the classical model.

13.
The segregation of classical and nonclassical 21-hydroxylase deficiency (21-OHD) and its linkage to HLA-B was investigated in 220 families. First, the surprisingly high frequency of the nonclassical 21-OHD gene estimated elsewhere was confirmed using a different methodology which avoided particular assumptions concerning the classification of an individual's genotype. In the present study the gene frequency was found to be .103 ± .020 in an ethnically pooled sample and was as high as .223 ± .062 among Ashkenazi Jews. Second, the segregation analysis of families ascertained through a nonclassical 21-OHD proband and those ascertained through a classical 21-OHD proband showed essentially identical results. A partial recessive model with no recombination between 21-OHD and HLA-B fitted the data better than did a complete recessive model with approximately 0.5% recombination between 21-OHD and HLA-B. The support for the partial over the complete recessive model depended on the assumed ascertainment probability, an unknown parameter in these data. Four families provided most of the evidence against the complete recessive model. All these included an unaffected sib who shared both HLA-B specificities in common with the affected proband. Possible explanations for the condition in these families include recombination, gene conversion, mutation in one of the parental gametes, or technical errors.

14.
Tallmon DA  Luikart G  Beaumont MA 《Genetics》2004,167(2):977-988
We describe and evaluate a new estimator of the effective population size (N(e)), a critical parameter in evolutionary and conservation biology. This new "SummStat" N(e) estimator is based upon the use of summary statistics in an approximate Bayesian computation framework to infer N(e). Simulations of a Wright-Fisher population with known N(e) show that the SummStat estimator is useful across a realistic range of individuals and loci sampled, generations between samples, and N(e) values. We also address the paucity of information about the relative performance of N(e) estimators by comparing the SummStat estimator to two recently developed likelihood-based estimators and a traditional moment-based estimator. The SummStat estimator is the least biased of the four estimators compared. In 32 of 36 parameter combinations investigated using initial allele frequencies drawn from a Dirichlet distribution, it has the lowest bias. The relative mean square error (RMSE) of the SummStat estimator was generally intermediate to the others. All of the estimators had RMSE > 1 when small samples (n = 20, five loci) were collected a generation apart. In contrast, when samples were separated by three or more generations and N(e) ≤ 50, the SummStat and likelihood-based estimators all had greatly reduced RMSE. Under the conditions simulated, SummStat confidence intervals were more conservative than the likelihood-based estimators and more likely to include true N(e). The greatest strength of the SummStat estimator is its flexible structure. This flexibility allows it to incorporate any potentially informative summary statistic from population genetic data.

15.
Here we present analytical studies to evaluate the relative efficiency of commonly used penetrance estimators using linkage designs. We investigated three different methods of estimating penetrance using sib pairs: maximum likelihood estimation (MLE) with trait information alone, MLE with both trait and marker information, and the MOD score approach. Modeling sib pairs with unknown phase, we evaluated the asymptotic relative efficiency between estimators under either random sampling or single ascertainment for an autosomal dominant or recessive disease. We then provide plots of the asymptotic relative efficiency, enabling researchers to easily determine regions where the MOD score or segregation alone performs with comparable efficiency relative to joint segregation and linkage.

16.
Doubly robust estimation in missing data and causal inference models
Bang H  Robins JM 《Biometrics》2005,61(4):962-973
The goal of this article is to construct doubly robust (DR) estimators in ignorable missing data and causal inference models. In a missing data model, an estimator is DR if it remains consistent when either (but not necessarily both) a model for the missingness mechanism or a model for the distribution of the complete data is correctly specified. Because with observational data one can never be sure that either a missingness model or a complete data model is correct, perhaps the best that can be hoped for is to find a DR estimator. DR estimators, in contrast to standard likelihood-based or (nonaugmented) inverse probability-weighted estimators, give the analyst two chances, instead of only one, to make a valid inference. In a causal inference model, an estimator is DR if it remains consistent when either a model for the treatment assignment mechanism or a model for the distribution of the counterfactual data is correctly specified. Because with observational data one can never be sure that a model for the treatment assignment mechanism or a model for the counterfactual data is correct, inference based on DR estimators should improve upon previous approaches. Indeed, we present the results of simulation studies which demonstrate that the finite sample performance of DR estimators is as impressive as theory would predict. The proposed method is applied to a cardiovascular clinical trial.
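One widely used DR construction for the causal setting is the augmented inverse-probability-weighted (AIPW) estimator of an average treatment effect, sketched below in Python. It is offered only as an illustration of the "two chances" idea and is not claimed to be the specific estimator developed in the article. The function assumes the propensity scores and outcome-model predictions have already been fitted elsewhere; all names are hypothetical.

```python
def aipw_ate(y, a, propensity, mu1, mu0):
    """AIPW estimate of the average treatment effect.

    y:          observed outcomes
    a:          treatment indicators (1 = treated, 0 = control)
    propensity: fitted P(A = 1 | X) for each unit, strictly between 0 and 1
    mu1, mu0:   fitted outcome-model predictions E[Y | A = 1, X], E[Y | A = 0, X]
    """
    n = len(y)
    total = 0.0
    for yi, ai, ei, m1, m0 in zip(y, a, propensity, mu1, mu0):
        total += (
            m1 - m0                                # outcome-model contrast
            + ai * (yi - m1) / ei                  # weighted residual, treated
            - (1 - ai) * (yi - m0) / (1.0 - ei)    # weighted residual, control
        )
    return total / n
```

If the outcome model is right, the residual terms average to zero whatever the weights; if the propensity model is right, the weighted residuals correct the errors of the outcome model. Either route alone is enough for consistency, which is the defining DR property described in the abstract.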

17.
The problem of ascertainment in segregation analysis arises when families are selected for study through ascertainment of affected individuals. In this case, ascertainment must be corrected for in data analysis. However, methods for ascertainment correction are not available for many common sampling schemes, e.g., sequential sampling of extended pedigrees (except in the case of "single" selection). Concerns about whether ascertainment correction is even required for large pedigrees, about whether and how multiple probands in the same pedigree can be taken into account properly, and about how to apply sequential sampling strategies have occupied many investigators in recent years. We address these concerns by reconsidering a central issue, namely, how to handle pedigree structure (including size). We introduce a new distinction, between sampling in such a way that observed pedigree structure does not depend on which pedigree members are probands (proband-independent [PI] sampling) and sampling in such a way that observed pedigree structure does depend on who are the probands (proband-dependent [PD] sampling). This distinction corresponds roughly (but not exactly) to the distinction between fixed-structure and sequential sampling. We show that conditioning on observed pedigree structure in ascertained data sets obtained under PD sampling is not in general correct (with the exception of "single" selection), while PI sampling of pedigree structures larger than simple sibships is generally not possible. Yet, in practice one has little choice but to condition on observed pedigree structure. We conclude that the problem of genetic modeling in ascertained data sets is, in most situations, literally intractable. We recommend that future efforts focus on the development of robust approximate approaches to the problem.

18.
A simple linear regression model is considered where the independent variable assumes only a finite number of values and the response variable is randomly right censored. However, the censoring distribution may depend on the covariate values. A class of noniterative estimators for the slope parameter is proposed, namely, the noniterative unrestricted estimator, the noniterative restricted estimator and the noniterative improved pretest estimator. The asymptotic bias and mean squared errors of the proposed estimators are derived and compared. The relative dominance picture of the estimators is investigated. A simulation study is also performed to assess the properties of the various estimators for small samples.

19.
We revisit the usual conditional likelihood for stratum-matched case-control studies and consider three alternatives that may be more appropriate for family-based gene-characterization studies: first, the prospective likelihood, that is, Pr(D|G,A); second, the retrospective likelihood, Pr(G|D); and third, the ascertainment-corrected joint likelihood, Pr(D,G|A). These likelihoods provide unbiased estimators of genetic relative risk parameters, as well as population allele frequencies and baseline risks. The parameter estimates based on the retrospective likelihood remain unbiased even when the ascertainment scheme cannot be modeled, as long as ascertainment only depends on families' phenotypes. Despite the need to estimate additional parameters, the prospective, retrospective, and joint likelihoods can lead to considerable gains in efficiency, relative to the conditional likelihood, when estimating genetic relative risk. This is true if baseline risks and allele frequencies can be assumed to be homogeneous. In the presence of heterogeneity, however, the parameter estimates assuming homogeneity can be seriously biased. We discuss the extent of this problem and present a mixed models approach for providing consistent parameter estimates when baseline risks and allele frequencies are heterogeneous. The efficiency gains of the mixed-model prospective, retrospective, and joint likelihoods relative to the efficiency of conditional likelihood are small in the situations presented here.

20.
Important aspects of population evolution have been investigated using nucleotide sequences. Under the neutral Wright–Fisher model, the scaled mutation rate represents twice the average number of new mutations per generation and it is one of the key parameters in population genetics. In this study, we present various methods of estimation of this parameter, analytical studies of their asymptotic behavior, as well as comparisons of the distributional behavior of these estimators through simulations. As knowledge of the genealogy is needed to compute the maximum likelihood estimator (MLE), an application with real data is also presented, using the jackknife to correct the bias of the MLE, which can be generated by the estimation of the tree. We proved analytically that Watterson's estimator and the MLE are asymptotically equivalent with the same rate of convergence to normality. Furthermore, we showed that the MLE has a better rate of convergence than Watterson's estimator for values of the parameter greater than one and that this relationship is reversed when the parameter is less than one.
