Similar articles
 20 similar articles found (search took 15 ms)
1.
Aspects of parameter estimation in ascertainment sampling schemes.   Cited by: 6 (self-citations: 6, citations by others: 0)
It has recently been suggested that ascertainment sampling estimation procedures commonly used are not fully efficient in that the number of unobserved families is an unknown parameter that should be estimated (contrary to common practice) along with the genetic parameters for fully efficient estimation. It has also been suggested that the frequency distribution of family size contains unknown parameters that should similarly be estimated with the genetic parameters. These two suggestions are considered in this paper. It is shown by means of an equivalence theorem that in both cases the estimates and their variances obtained by adopting the suggested procedure are identical with those found by ignoring the unobserved families and by ignoring the family-size distribution. This demonstration leads to a formal justification of further procedures, in particular: (1) use of "method-of-moments" estimators, (2) ignoring the ascertainment scheme in some cases when estimating parameters, and (3) forming estimates of parameters when various parts of the data are obtained by different ascertainment schemes.

2.
Procedures to estimate the genetic segregation parameter when ascertainment of families is incomplete have previously relied on iterative computer algorithms, since estimators with closed form are lacking. We now present the Minimum Variance Unbiased Estimator for the segregation parameter under any ascertainment probability. This estimator assumes a simple form when ascertainment is complete. We also present a simple estimator, akin to Li and Mantel's (1968) estimator, but without the restriction that ascertainment be complete. The performance of these estimators is compared with respect to asymptotic efficiency. We also provide tables that define the required number of families of a given size that need to be sampled to achieve a specific power for testing simple hypotheses on the segregation parameter.
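
As a rough illustration of what a closed-form estimator looks like here, the sketch below computes the estimator usually attributed to Li and Mantel (1968) for complete ascertainment, (R - J)/(T - J), where T is the total number of sibs, R the number of affected sibs, and J the number of sibships with exactly one affected sib. The formula is stated from general knowledge, not from this abstract, and the function name and input layout are assumptions. It is not the new estimator the abstract announces.

```python
from typing import Iterable, Tuple


def li_mantel_estimate(families: Iterable[Tuple[int, int]]) -> float:
    """Segregation-ratio estimate (R - J) / (T - J) under complete ascertainment.

    `families` holds (sibship_size, affected_count) pairs; every sibship is
    assumed to have been sampled because it contains at least one affected sib.
    """
    families = list(families)
    for size, affected in families:
        if not 1 <= affected <= size:
            raise ValueError("each sibship needs 1 <= affected_count <= sibship_size")

    total_sibs = sum(size for size, _ in families)               # T
    total_affected = sum(affected for _, affected in families)   # R
    singletons = sum(1 for _, affected in families if affected == 1)  # J

    if total_sibs == singletons:
        raise ValueError("estimate undefined: every sibship is a single affected child")
    return (total_affected - singletons) / (total_sibs - singletons)
```

For example, `li_mantel_estimate([(2, 1), (2, 2), (3, 1)])` uses T = 7, R = 4 and J = 2.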

3.
The ascertainment problem arises when families are sampled by a nonrandom process and some assumption about this sampling process must be made in order to estimate genetic parameters. Under classical ascertainment assumptions, estimation of genetic parameters cannot be separated from estimation of the parameters of the ascertainment process, so that any misspecification of the ascertainment process causes biases in estimation of the genetic parameters. Ewens and Shute proposed a resolution to this problem, involving conditioning the likelihood of the sample on the part of the data which is "relevant to ascertainment." The usefulness of this approach can only be assessed by examining the properties (in particular, bias and standard error) of the estimates which arise by using it for a wide range of parameter values and family size distributions and then comparing these biases and standard errors with those arising under classical ascertainment procedures. These comparisons are carried out in the present paper, and we also compare the proposed method with procedures which condition on, or ignore, parts of the data.
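
To see why the genetic and ascertainment parameters are entangled under the classical assumptions, it may help to write out the textbook binomial model. This is a sketch under assumptions not stated in the abstract: a sibship of size $s$, segregation probability $p$, and a constant probability $\pi$ that each affected sib independently becomes a proband.

```latex
% Probability of r affected sibs in a sibship of size s, given that the
% sibship was ascertained (at least one proband):
\[
  P(r \mid s,\ \text{ascertained})
  = \frac{\binom{s}{r}\, p^{r} (1-p)^{s-r}\,\bigl[\,1-(1-\pi)^{r}\bigr]}
         {1-(1-\pi p)^{s}},
  \qquad r = 1,\dots,s .
\]
% The denominator is the sum of the numerator over r = 0..s:
% sum_r C(s,r) p^r (1-p)^{s-r} [1 - (1-pi)^r] = 1 - (1 - pi p)^s.
```

Both $p$ and $\pi$ appear in every term, so a wrong model for $\pi$ carries over into the estimate of $p$.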

4.
Family-size distribution and Ewens' equivalence theorem.   Cited by: 2 (self-citations: 2, citations by others: 0)
Segregation analysis of a data set containing nuclear families of more than one sibship size is considered, and two different formulations of the likelihood are examined. One is the "separate-multinomials" formulation, which treats each family size as representing a separate multinomial distribution; the other is the "grand-multinomial" formulation, which treats the entire data set as representing one distribution. It is shown that these two formulations are equivalent, if and only if the population distribution of family sizes is completely unknown. However, if anything is known about the family-size distribution, the grand-multinomial formulation, although more cumbersome, makes more complete use of the data; moreover, it enables the use of one-child families in a segregation analysis. The relationship of this work to Ewens' equivalence theorem concerning "unconditional" and "conditional" likelihoods is discussed. The findings are illustrated with a simple example, and their practical relevance to real-life segregation analysis is discussed.

5.
The problem of ascertainment for linkage analysis.   Cited by: 2 (self-citations: 0, citations by others: 2)
It is generally believed that ascertainment corrections are unnecessary in linkage analysis, provided individuals are selected for study solely on the basis of trait phenotype and not on the basis of marker genotype. The theoretical rationale for this is that standard linkage analytic methods involve conditioning likelihoods on all the trait data, which may be viewed as an application of the ascertainment assumption-free (AAF) method of Ewens and Shute. In this paper, we show that when the observed pedigree structure depends on which relatives within a pedigree happen to have been the probands (proband-dependent, or PD, sampling) conditioning on all the trait data is not a valid application of the AAF method and will result in asymptotically biased estimates of genetic parameters (except under single ascertainment). Furthermore, this result holds even if the recombination fraction R is the only parameter of interest. Since the lod score is proportional to the likelihood of the marker data conditional on all the trait data, this means that when data are obtained under PD sampling the lod score will yield asymptotically biased estimates of R, and that so-called mod scores (i.e., lod scores maximized over both R and parameters theta of the trait distribution) will yield asymptotically biased estimates of R and theta. Furthermore, the problem appears to be intractable, in the sense that it is not possible to formulate the correct likelihood conditional on observed pedigree structure. In this paper we do not investigate the numerical magnitude of the bias, which may be small in many situations. On the other hand, virtually all linkage data sets are collected under PD sampling. Thus, the existence of this bias will be the rule rather than the exception in the usual applications.

6.
T D Tosteson, B Rosner, S Redline. Biometrics 1991, 47(4): 1257-1265
Estimation is considered for the class of conditional logistic regression models for clustered binary data proposed by Qu et al. (Communications in Statistics, Series A 16, 3447-3476, 1987) when clusters are sampled on the basis of the outcome for one or more cluster members. The problem is suggested by data from a study designed to investigate familial aggregation of sleep disorders. After appropriate consideration of the mode of ascertainment of "cases" and "controls," it is shown that the model is preserved under this form of sampling, and a method of estimation is presented. The inconsistency of two alternative methods is demonstrated, and an example is provided.

7.
Kolassa JE, Tanner MA. Biometrics 1999, 55(1): 246-251
This article presents an algorithm for approximate frequentist conditional inference on two or more parameters for any regression model in the Generalized Linear Model (GLIM) family. We thereby extend highly accurate inference beyond the cases of logistic regression and contingency tables implemented in commercially available software. The method makes use of the double saddlepoint approximations of Skovgaard (1987, Journal of Applied Probability 24, 875-887) and Jensen (1992, Biometrika 79, 693-703) to the conditional cumulative distribution function of a sufficient statistic given the remaining sufficient statistics. This approximation is then used in conjunction with noniterative Monte Carlo methods to generate a sample from a distribution that approximates the joint distribution of the sufficient statistics associated with the parameters of interest conditional on the observed values of the sufficient statistics associated with the nuisance parameters. This algorithm is an alternate approach to that presented by Kolassa and Tanner (1994, Journal of the American Statistical Association 89, 697-702), in which a Markov chain is generated whose equilibrium distribution under certain regularity conditions approximates the joint distribution of interest. In Kolassa and Tanner (1994), the Gibbs sampler was used in conjunction with these univariate conditional distribution function approximations. The method of this paper does not require the construction and simulation of a Markov chain, thus avoiding the need to develop regularity conditions under which the algorithm converges and the need for the data analyst to check convergence of the particular chain. Examples involving logistic and truncated Poisson regression are presented.

8.
We propose a method for testing gene-environment (G × E) interactions on a complex trait in family-based studies in which a phenotypic ascertainment criterion has been imposed. This novel approach employs G-estimation, a semiparametric estimation technique from the causal inference literature, to avoid modeling of the association between the environmental exposure and the phenotype, to gain robustness against unmeasured confounding due to population substructure, and to acknowledge the ascertainment conditions. The proposed test allows for incomplete parental genotypes. It is compared by simulation studies to an analogous conditional likelihood-based approach and to the QBAT-I test, which also invokes the G-estimation principle but ignores ascertainment. We apply our approach to a study of chronic obstructive pulmonary disorder.

9.
It has been shown that the classical binomial form of ascertainment, assuming a constant probability pi that any affected individual may become a proband for his pedigree, cannot describe a rather wide range of ascertainment procedures that might arise in practice. Some more general heuristic ascertainment formulas might then be preferred, and in this paper we consider the probabilistic basis for these formulas. We retain the binomial assumption of the classical scheme but allow the ascertainment probability to depend on the number of potential probands per pedigree. This probability can be expressed by an increasing or a decreasing function of that number. Various illustrations are given and situations where the "cooperative" binomial scheme should be valuable are discussed.
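
The idea can be sketched numerically. Under the classical scheme, a pedigree with r potential probands is ascertained with probability 1 - (1 - pi)^r. The "cooperative" variant described above lets pi depend on r. In the minimal sketch below, the function names and the example form pi(r) = 0.6 / r are hypothetical illustrations, not formulas taken from the paper.

```python
from typing import Callable


def ascertainment_probability(r: int, pi_of_r: Callable[[int], float]) -> float:
    """P(at least one proband) when each of r affected members independently
    becomes a proband with probability pi_of_r(r)."""
    if r < 0:
        raise ValueError("r must be non-negative")
    if r == 0:
        return 0.0  # no potential probands, so the pedigree cannot be ascertained
    pi = pi_of_r(r)
    if not 0.0 <= pi <= 1.0:
        raise ValueError("pi_of_r must return a probability")
    return 1.0 - (1.0 - pi) ** r


def classical(r: int) -> float:
    """Classical scheme: constant pi (0.5 chosen arbitrarily for illustration)."""
    return 0.5


def decreasing(r: int) -> float:
    """Hypothetical cooperative scheme: pi shrinks as r grows."""
    return 0.6 / r
```

With the constant scheme the ascertainment probability rises with r, while the hypothetical decreasing scheme can make a pedigree with two affected members less likely to be ascertained than one with a single affected member.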

10.
Detection bias in recessive ascertainment is generally considered to be confined in a narrow range between unbiased truncate ascertainment and single ascertainment, where methods of segregation analysis are established. While there are arguments for an extended range of analysis, a deflated detection progression below the unbiased level is still being considered as theoretical ground or ignored as sporadics. I show here a method of gauging the ascertainment levels of surveyed data in a geometric continuum. The method is valid for recessive segregation at any ascertainment level and in simplex or multiplex sibships of whatever degree of truncation. Four previously published surveys are used to show conformation with real data and the existence of detection trends spanning the range from the unsuspected very depressed bias level to the inflated level above single ascertainment.

11.
This paper proposes a semiparametric methodology for modeling multivariate and conditional distributions. We first build a multivariate distribution whose dependence structure is induced by a Gaussian copula and whose marginal distributions are estimated nonparametrically via mixtures of B-spline densities. The conditional distribution of a given variable is obtained in closed form from this multivariate distribution. We take a Bayesian approach, using Markov chain Monte Carlo methods for inference. We study the frequentist properties of the proposed methodology via simulation and apply the method to estimation of conditional densities of summary statistics, used for computing conditional local false discovery rates, from genetic association studies of schizophrenia and cardiovascular disease risk factors.

12.
Bagiella E. Biometrics 2006, 62(1): 54-60
Age at ascertainment from prevalence case-control data identifies the age-specific odds of disease. When age at onset is available from the cases, the conditional distribution of age at onset, given that disease occurs, is identifiable. Combining both kinds of information by introducing a multiplicative intercept allows identification of the marginal distribution of age at onset. Here, the approach is extended to the two-sample setting through a generalization of the multiplicative intercept model. The efficiency of the approach is explored and a test statistic based on the integrated difference between distribution function estimates is proposed. An approach to regularization of the likelihood is discussed. The methods are illustrated through an application to data on colorectal polyps obtained from a case-control study of individuals undergoing colonoscopy.

13.
Summary: We examine situations where interest lies in the conditional association between outcome and exposure variables, given potential confounding variables. Concern arises that some potential confounders may not be measured accurately, whereas others may not be measured at all. Some form of sensitivity analysis might be employed, to assess how this limitation in available data impacts inference. A Bayesian approach to sensitivity analysis is straightforward in concept: a prior distribution is formed to encapsulate plausible relationships between unobserved and observed variables, and posterior inference about the conditional exposure–disease relationship then follows. In practice, though, it can be challenging to form such a prior distribution in both a realistic and simple manner. Moreover, it can be difficult to develop an attendant Markov chain Monte Carlo (MCMC) algorithm that will work effectively on a posterior distribution arising from a highly nonidentified model. In this article, a simple prior distribution for acknowledging both poorly measured and unmeasured confounding variables is developed. It requires that only a small number of hyperparameters be set by the user. Moreover, a particular computational approach for posterior inference is developed, because application of MCMC in a standard manner is seen to be ineffective in this problem.

14.
A Bayesian solution for making inferences about segregation parameters with no information about the ascertainment is presented. Inferences about the segregation probability and the probability of being sporadic are made through the posterior marginal distribution of these parameters after integrating out the ascertainment probability, the nuisance parameter. The method was tested with real and simulated data and performed well. Original Fanconi anemia data, for which no information about the ascertainment was available, were then analyzed, with results that confirmed a monogenic autosomal recessive mode of inheritance.
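
The step of integrating out a nuisance ascertainment probability can be sketched on a grid. The example below is a simplified illustration and not the paper's procedure: it assumes the textbook binomial ascertainment likelihood for sibships, independent uniform priors on the segregation probability p and the ascertainment probability pi, and it leaves out the sporadic-case parameter entirely. All function names are hypothetical.

```python
from math import comb
from typing import List, Sequence, Tuple


def family_likelihood(s: int, r: int, p: float, pi: float) -> float:
    """P(r affected among s sibs | sibship ascertained), binomial ascertainment."""
    numerator = comb(s, r) * p**r * (1.0 - p) ** (s - r) * (1.0 - (1.0 - pi) ** r)
    denominator = 1.0 - (1.0 - pi * p) ** s
    return numerator / denominator


def marginal_posterior_p(
    families: Sequence[Tuple[int, int]], grid_size: int = 49
) -> Tuple[List[float], List[float]]:
    """Grid approximation to the posterior of p with pi summed (integrated) out.

    `families` holds (sibship_size, affected_count) pairs with affected_count >= 1.
    Uniform priors on (0, 1) are assumed for both p and pi.
    """
    # Interior grid points only, so the likelihood denominator is never zero.
    grid = [(k + 1) / (grid_size + 1) for k in range(grid_size)]
    weights = []
    for p in grid:
        total = 0.0
        for pi in grid:  # sum over the nuisance parameter
            likelihood = 1.0
            for s, r in families:
                likelihood *= family_likelihood(s, r, p, pi)
            total += likelihood
        weights.append(total)
    norm = sum(weights)
    return grid, [w / norm for w in weights]
```

A call such as `marginal_posterior_p([(3, 1), (4, 2), (2, 1)])` returns the grid of p values and normalized posterior weights, from which a posterior mean or credible interval could be read off.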

15.
Uncertainty about the ascertainment of human family data leads to a need for robust methods for estimating genetic and environmental effects. This in turn leads to a need for efficient techniques for estimating model parameters for data generated under one parametric model but analyzed under a second model. If the two models correspond to different ascertainment schemes for the same exponential family, simple formulas for the asymptotic means and standard errors of both conditional and unconditional MLEs can be derived. In an example for continuous sibship data, these formulas show that estimates derived from conditioning on proband value have greater asymptotic bias than two other estimators. Similarly, either conditioning on proband value or conditioning on the number of affected family members resulted in biases of up to 30% when ascertainment depended on the values of more than one affected family member.

16.
We revisit the usual conditional likelihood for stratum-matched case-control studies and consider three alternatives that may be more appropriate for family-based gene-characterization studies: first, the prospective likelihood, that is, Pr(D|G,A); second, the retrospective likelihood, Pr(G|D); and third, the ascertainment-corrected joint likelihood, Pr(D,G|A). These likelihoods provide unbiased estimators of genetic relative risk parameters, as well as population allele frequencies and baseline risks. The parameter estimates based on the retrospective likelihood remain unbiased even when the ascertainment scheme cannot be modeled, as long as ascertainment only depends on families' phenotypes. Despite the need to estimate additional parameters, the prospective, retrospective, and joint likelihoods can lead to considerable gains in efficiency, relative to the conditional likelihood, when estimating genetic relative risk. This is true if baseline risks and allele frequencies can be assumed to be homogeneous. In the presence of heterogeneity, however, the parameter estimates assuming homogeneity can be seriously biased. We discuss the extent of this problem and present a mixed models approach for providing consistent parameter estimates when baseline risks and allele frequencies are heterogeneous. The efficiency gains of the mixed-model prospective, retrospective, and joint likelihoods relative to the efficiency of conditional likelihood are small in the situations presented here.

17.
The problem of exact conditional inference for discrete multivariate case-control data has two forms. The first is grouped case-control data, where Monte Carlo computations can be done using the importance sampling method of Booth and Butler (1999, Biometrika 86, 321-332), or a proposed alternative sequential importance sampling method. The second form is matched case-control data. For this analysis we propose a new exact sampling method based on the conditional-Poisson distribution for conditional testing with one binary and one integral ordered covariate. This method makes computations on data sets with large numbers of matched sets fast and accurate. We provide detailed derivation of the constraints and conditional distributions for conditional inference on grouped and matched data. The methods are illustrated on several new and old data sets.

18.
We derive the conditional probabilities for estimating the sex ratio in families ascertained through affected males for the study of X-linked recessive diseases. These conditional probabilities correct for the fact that the probability that a family will be ascertained increases with the number of males in the family. Data from four published studies for X-linked ichthyosis vulgaris are analyzed, three having an excess of males and one having a highly statistically significant excess of males. It is not known if this difference in the two samples represents a biological difference between the two populations or an unrecognized ascertainment bias.

19.
Feng R, Zhang H. Human genetics 2006, 119(4): 429-435
Most genetic studies recruit high risk families and the discoveries are based on non-randomly selected groups. We must consider the consequences of this ascertainment process in order to apply the results of genetic research to the general population. In previous reports, we developed a latent variable model to assess the familial aggregation and inheritability of ordinal-scaled diseases, and found a major gene component of alcoholism after applying the model to the data from the Yale family study of comorbidity of alcoholism and anxiety (YFSCAA). In this report, we examine the ascertainment effects on parameter estimates and correct potential bias in the latent variable model. The simulation studies for various ascertainment schemes suggest that our ascertainment adjustment is necessary and effective. We also find that the estimated effects are relatively unbiased for the particular ascertainment scheme used in the YFSCAA, which assures the validity of our earlier conclusion.

20.
S. Xu. Genetics 1996, 144(4): 1951-1960
The proportion of alleles identical by descent (IBD) determines the genetic covariance between relatives, and thus is crucial in estimating genetic variances of quantitative trait loci (QTL). However, IBD proportions at QTL are unobservable and must be inferred from marker information. The conventional method of QTL variance analysis maximizes the likelihood function by replacing the missing IBDs by their conditional expectations (the expectation method), while in fact the full likelihood function should take into account the conditional distribution of IBDs (the distribution method). The distribution method for families of more than two sibs has not been obvious because there are n(n - 1)/2 IBD variables in a family of size n, forming an n × n symmetrical matrix. In this paper, I use four binary variables, where each indicates the event that an allele from one of the four grandparents has passed to the individual. The IBD proportion between any two sibs is then expressed as a function of the indicators. Subsequently, the joint distribution of the IBD matrix is derived from the distribution of the indicator variables. Given the joint distribution of the unknown IBDs, a method to compute the full likelihood function is developed for families of arbitrary sizes.
