Similar Articles
20 similar articles found.
1.
Li M, Boehnke M, Abecasis GR, Song PX. Genetics 2006, 173(4):2317-2327
Mapping and identifying variants that influence quantitative traits is an important problem for genetic studies. Traditional QTL mapping relies on a variance-components (VC) approach with the key assumption that the trait values in a family follow a multivariate normal distribution. Violation of this assumption can lead to inflated type I error, reduced power, and biased parameter estimates. To accommodate nonnormally distributed data, we developed and implemented a modified VC method, which we call the "copula VC method," that directly models the nonnormal distribution using Gaussian copulas. The copula VC method allows the analysis of continuous, discrete, and censored trait data, and the standard VC method is a special case when the data are distributed as multivariate normal. Through the use of link functions, the copula VC method can easily incorporate covariates. We use computer simulations to show that the proposed method yields unbiased parameter estimates, correct type I error rates, and improved power for testing linkage with a variety of nonnormal traits as compared with the standard VC and the regression-based methods.
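The core device of a Gaussian copula is mapping a non-normal trait to normal scores through the probability integral transform. A minimal sketch of that building block follows (illustrative only; the function name is ours and this is not the authors' implementation, which also handles discrete and censored traits):

```python
import numpy as np
from scipy.stats import norm, rankdata

def gaussian_copula_scores(y):
    """Map trait values to normal scores via the probability integral
    transform, the building block of a Gaussian-copula model.
    Uses the empirical CDF, rescaled to avoid 0 and 1."""
    n = len(y)
    u = rankdata(y) / (n + 1)
    return norm.ppf(u)

rng = np.random.default_rng(0)
y = rng.exponential(scale=2.0, size=1000)   # skewed, non-normal trait
z = gaussian_copula_scores(y)               # approximately standard normal
```

After the transform, standard multivariate-normal machinery (such as the VC likelihood) can be applied to `z` while the original margins are modeled separately.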

2.
Quantitative trait loci (QTL) are usually searched for using classical interval mapping methods which assume that the trait of interest follows a normal distribution. However, these methods cannot take into account features of most survival data such as a non-normal distribution and the presence of censored data. We propose two new QTL detection approaches which allow the consideration of censored data. One interval mapping method uses a Weibull model (W), which is popular in parametric modelling of survival traits, and the other uses a Cox model (C), which avoids making any assumption on the trait distribution. Data were simulated following the structure of a published experiment. Using simulated data, we compare W, C and a classical interval mapping method using a Gaussian model on uncensored data (G) or on all data (G'=censored data analysed as though records were uncensored). An adequate mathematical transformation was used for all parametric methods (G, G' and W). When data were not censored, the four methods gave similar results. However, when some data were censored, the power of QTL detection and accuracy of QTL location and of estimation of QTL effects for G decreased considerably with censoring, particularly when censoring was at a fixed date. This decrease with censoring was observed also with G', but it was less severe. Censoring had a negligible effect on results obtained with the W and C methods.
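As a rough sketch of the parametric (W) ingredient, a Weibull model can be fit to right-censored data by maximum likelihood: failures contribute the density, censored records the survivor function. The following is a minimal illustration on simulated data (names and simulation settings are ours, not the paper's):

```python
import numpy as np
from scipy.optimize import minimize

def weibull_negloglik(params, t, event):
    """Negative log-likelihood for right-censored Weibull data.
    Failures (event == 1) contribute log f(t); censored times (event == 0)
    contribute log S(t). Parameters are log-shape and log-scale."""
    log_k, log_lam = params
    k, lam = np.exp(log_k), np.exp(log_lam)
    z = (t / lam) ** k
    logf = np.log(k / lam) + (k - 1) * np.log(t / lam) - z
    logS = -z
    return -np.sum(np.where(event == 1, logf, logS))

rng = np.random.default_rng(2)
t_true = rng.weibull(2.0, size=2000) * 3.0   # shape 2, scale 3
c = rng.uniform(0, 6, size=2000)             # random censoring times
t = np.minimum(t_true, c)
event = (t_true <= c).astype(int)
fit = minimize(weibull_negloglik, x0=[0.0, 0.0], args=(t, event))
k_hat, lam_hat = np.exp(fit.x)               # should be near 2 and 3
```

The log-parameterization keeps the optimizer unconstrained; the same likelihood structure underlies Weibull interval mapping, where QTL genotype probabilities enter the model as covariates.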

3.
Association mapping can be a powerful tool for detecting quantitative trait loci (QTLs) without requiring line-crossing experiments. We previously proposed a Bayesian approach for simultaneously mapping multiple QTLs by a regression method that directly incorporates estimates of the population structure. In the present study, we extended our method to analyze ordinal and censored traits, since both types of traits are common in the evaluation of germplasm collections. Ordinal-probit and tobit models were employed to analyze ordinal and censored traits, respectively. In both models, we postulated the existence of a latent continuous variable associated with the observable data, and we used a Markov-chain Monte Carlo algorithm to sample the latent variable and determine the model parameters. We evaluated the efficiency of our approach by using simulated- and real-trait analyses of a rice germplasm collection. Simulation analyses based on real marker data showed that our models could reduce both false-positive and false-negative rates in detecting QTLs to reasonable levels. Simulation analyses based on highly polymorphic marker data, which were generated by coalescent simulations, showed that our models could be applied to genotype data based on highly polymorphic marker systems, like simple sequence repeats. For the real traits, we analyzed heading date as a censored trait and amylose content and the shape of milled rice grains as ordinal traits. We found significant markers that may be linked to previously reported QTLs. Our approach will be useful for whole-genome association mapping of ordinal and censored traits in rice germplasm collections.

4.
In risk assessment and environmental monitoring studies, concentration measurements frequently fall below detection limits (DL) of measuring instruments, resulting in left-censored data. The principal approaches for handling censored data include the substitution-based method, maximum likelihood estimation, robust regression on order statistics, and Kaplan-Meier. In practice, censored data are substituted with an arbitrary value prior to use of traditional statistical methods. Although some studies have evaluated the substitution performance in estimating population characteristics, they have focused mainly on normally and lognormally distributed data that contain a single DL. We employ Monte Carlo simulations to assess the impact of substitution when estimating population parameters based on censored data containing multiple DLs. We also consider different distributional assumptions including lognormal, Weibull, and gamma. We show that the reliability of the estimates after substitution is highly sensitive to distributional characteristics such as mean, standard deviation, skewness, and also data characteristics such as censoring percentage. The results highlight that although the performance of the substitution-based method improves as the censoring percentage decreases, its performance still depends on the population's distributional characteristics. Practical implications that follow from our findings indicate that caution must be taken in using the substitution method when analyzing censored environmental data.
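The substitution idea, and the bias it can introduce, can be sketched for the simplest case of a single detection limit and the common DL/2 rule (the study's simulations cover multiple DLs and several distributions; the names below are ours):

```python
import numpy as np

def substitute_mean(x, dl, frac=0.5):
    """Estimate the population mean after replacing values below the
    detection limit `dl` with `frac * dl` (the common DL/2 substitution).
    Illustrative sketch of the substitution-based method."""
    x = np.asarray(x, dtype=float)
    filled = np.where(x < dl, frac * dl, x)
    return filled.mean()

rng = np.random.default_rng(1)
sample = rng.lognormal(mean=0.0, sigma=1.0, size=50_000)
dl = np.quantile(sample, 0.3)          # ~30% of values fall below DL
est = substitute_mean(sample, dl)
bias = est - sample.mean()             # nonzero: DL/2 misses the censored mean
```

For this lognormal example the DL/2 value sits below the conditional mean of the censored observations, so the substituted estimate is biased downward; the direction and size of the bias depend on the distribution and the censoring percentage, which is the study's central point.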

5.
Variance-component methods are popular and flexible analytic tools for elucidating the genetic mechanisms of complex quantitative traits from pedigree data. However, variance-component methods typically assume that the trait of interest follows a multivariate normal distribution within a pedigree. Studies have shown that violation of this normality assumption can lead to biased parameter estimates and inflations in type-I error. This limits the application of variance-component methods to more general trait outcomes, whether continuous or categorical in nature. In this paper, we develop and apply a general variance-component framework for pedigree analysis of continuous and categorical outcomes. We develop appropriate models using generalized-linear mixed model theory and fit such models using approximate maximum-likelihood procedures. Using our proposed method, we demonstrate that one can perform variance-component pedigree analysis on outcomes that follow any exponential-family distribution. Additionally, we also show how one can modify the method to perform pedigree analysis of ordinal outcomes. We also discuss extensions of our variance-component framework to accommodate pedigrees ascertained based on trait outcome. We demonstrate the feasibility of our method using both simulated data and data from a genetic study of ovarian insufficiency.

6.
Effects of censoring on parameter estimates and power in genetic modeling. (Cited by 5; self-citations: 0)
Genetic and environmental influences on variance in phenotypic traits may be estimated with normal theory Maximum Likelihood (ML). However, when the assumption of multivariate normality is not met, this method may result in biased parameter estimates and incorrect likelihood ratio tests. We simulated multivariate normally distributed twin data under the assumption of three different genetic models. Genetic model fitting was performed in six data sets: multivariate normal data, discrete uncensored data, censored data, square root transformed censored data, normal scores of censored data, and categorical data. Estimates were obtained with normal theory ML (data sets 1-5) and with categorical data analysis (data set 6). Statistical power was examined by fitting reduced models to the data. When fitting an ACE model to censored data, an unbiased estimate of the additive genetic effect was obtained. However, the common environmental effect was underestimated and the unique environmental effect was overestimated. Transformations did not remove this bias. When fitting an ADE model, the additive genetic effect was underestimated while the dominant and unique environmental effects were overestimated. In all models, the correct parameter estimates were recovered with categorical data analysis. However, with categorical data analysis, the statistical power decreased. The analysis of data with an L-shaped distribution using normal theory ML results in biased parameter estimates. Unbiased parameter estimates are obtained with categorical data analysis, but the power decreases.

7.
Most statistical methods for censored survival data assume there is no dependence between the lifetime and censoring mechanisms, an assumption which is often doubtful in practice. In this paper we study a parametric model which allows for dependence in terms of a parameter delta and a bias function B(t, theta). We propose a sensitivity analysis on the estimate of the parameter of interest for small values of delta. This parameter measures the dependence between the lifetime and the censoring mechanisms. Its size can be interpreted in terms of a correlation coefficient between the two mechanisms. A medical example suggests that even a small degree of dependence between the failure and censoring processes can have a noticeable effect on the analysis.

8.
This study examined the method of simultaneous estimation of recombination frequency and parameters for a qualitative trait locus and compared the results with those of standard methods of linkage analysis. With both approaches we were able to detect linkage of an incompletely penetrant qualitative trait to highly polymorphic markers with recombination frequencies in the range of .00-.05. Our results suggest that detecting linkage at larger recombination frequencies may require larger data sets or large high-density families. When applied to all families without regard to informativeness of the family structure for linkage, analyses of simulated data could detect no advantage of simultaneous estimation over more traditional and much less time-consuming methods, either in detecting linkage, estimating recombination frequency, refining estimates of parameters for the qualitative trait locus, or avoiding false evidence for linkage. However, the method of sampling affected results.

9.
Hanley JA, Parnes MN. Biometrics 1983, 39(1):129-139
This paper presents examples of situations in which one wishes to estimate a multivariate distribution from data that may be right-censored. A distinction is made between what we term 'homogeneous' and 'heterogeneous' censoring. It is shown how a multivariate empirical survivor function must be constructed in order to be considered a (nonparametric) maximum likelihood estimate of the underlying survivor function. A closed-form solution, similar to the product-limit estimate of Kaplan and Meier, is possible with homogeneous censoring, but an iterative method, such as the EM algorithm, is required with heterogeneous censoring. An example is given in which an anomaly is produced if censored multivariate data are analyzed as a series of univariate variables; this anomaly is shown to disappear if the methods of this paper are used.  
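For reference, the closed-form product-limit estimate mentioned here, in its familiar univariate form, is short enough to sketch (illustrative; the paper's contribution is the multivariate extension, which this sketch does not attempt):

```python
import numpy as np

def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) estimate of the survivor function
    for right-censored data. `events` is 1 for an observed failure,
    0 for a censored time. Returns (time, S(time)) pairs at failure times."""
    order = np.argsort(times)
    t, d = np.asarray(times)[order], np.asarray(events)[order]
    surv, s = [], 1.0
    for u in np.unique(t[d == 1]):       # distinct failure times
        at_risk = np.sum(t >= u)         # still under observation at u
        deaths = np.sum((t == u) & (d == 1))
        s *= 1.0 - deaths / at_risk      # product-limit update
        surv.append((u, s))
    return surv

km = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

In the toy call, the censored times at 2 and 4 reduce the risk sets of later failures without themselves triggering a drop in the curve.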

10.
The method of generalized pairwise comparisons (GPC) is an extension of the well-known nonparametric Wilcoxon–Mann–Whitney test for comparing two groups of observations. Multiple generalizations of the Wilcoxon–Mann–Whitney test and other GPC methods have been proposed over the years to handle censored data. These methods apply different approaches to handling loss of information due to censoring: ignoring noninformative pairwise comparisons due to censoring (Gehan, Harrell, and Buyse); imputation using estimates of the survival distribution (Efron, Péron, and Latta); or inverse probability of censoring weighting (IPCW, Datta and Dong). Based on the GPC statistic, a measure of treatment effect, the “net benefit,” can be defined. It quantifies the difference between the probabilities that a randomly selected individual from one group is doing better than an individual from the other group. This paper aims at evaluating GPC methods for censored data, both in the context of hypothesis testing and estimation, and providing recommendations related to their choice in various situations. The methods that ignore uninformative pairs have comparable power to more complex and computationally demanding methods in situations of low censoring, and are slightly superior for high proportions (>40%) of censoring. If one is interested in estimation of the net benefit, Harrell's c index is an unbiased estimator if the proportional hazards assumption holds. Otherwise, the imputation (Efron or Péron) or IPCW (Datta, Dong) methods provide unbiased estimators for drop-out censoring proportions up to 60%.
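The simplest of the strategies above, ignoring uninformative pairs, can be sketched directly: compare every pair across groups and count only the pairs where censoring does not obscure the ordering (a hypothetical Gehan-style sketch with names of our choosing, not a package API):

```python
def gehan_net_benefit(t1, e1, t0, e0):
    """Net benefit from generalized pairwise comparisons, Gehan-style.
    A pair is a win for group 1 if its subject demonstrably survives
    longer, a loss if demonstrably shorter, and is ignored when
    censoring makes the comparison uninformative.
    t*: survival times; e*: 1 = observed failure, 0 = censored."""
    wins = losses = 0
    for ti, di in zip(t1, e1):
        for tj, dj in zip(t0, e0):
            if tj < ti and dj == 1:      # group-0 subject failed first
                wins += 1
            elif ti < tj and di == 1:    # group-1 subject failed first
                losses += 1
            # otherwise: tie, or one subject censored before the other failed
    return (wins - losses) / (len(t1) * len(t0))

nb = gehan_net_benefit([5, 6], [1, 0], [2, 3], [1, 1])   # all pairs favor group 1
```

A net benefit of +1 means every informative pair favors group 1, 0 means no difference, and -1 means every pair favors group 0; the imputation and IPCW variants differ only in how they recover the pairs this version discards.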

11.
This paper deals with a Cox proportional hazards regression model, where some covariates of interest are randomly right‐censored. While methods for censored outcomes have become ubiquitous in the literature, methods for censored covariates have thus far received little attention and, for the most part, dealt with the issue of limit‐of‐detection. For randomly censored covariates, an often‐used method is the inefficient complete‐case analysis (CCA) which consists in deleting censored observations in the data analysis. When censoring is not completely independent, the CCA leads to biased and spurious results. Methods for missing covariate data, including type I and type II covariate censoring as well as limit‐of‐detection do not readily apply due to the fundamentally different nature of randomly censored covariates. We develop a novel method for censored covariates using a conditional mean imputation based on either Kaplan–Meier estimates or a Cox proportional hazards model to estimate the effects of these covariates on a time‐to‐event outcome. We evaluate the performance of the proposed method through simulation studies and show that it provides good bias reduction and statistical efficiency. Finally, we illustrate the method using data from the Framingham Heart Study to assess the relationship between offspring and parental age of onset of cardiovascular events.

12.
Recurrent event data analyses are usually conducted under the assumption that the censoring time is independent of the recurrent event process. In many applications the censoring time can be informative about the underlying recurrent event process, especially in situations where a correlated failure event could potentially terminate the observation of recurrent events. In this article, we consider a semiparametric model of recurrent event data that allows correlations between censoring times and recurrent event process via frailty. This flexible framework incorporates both time-dependent and time-independent covariates in the formulation, while leaving the distributions of frailty and censoring times unspecified. We propose a novel semiparametric inference procedure that depends on neither the frailty nor the censoring time distribution. Large sample properties of the regression parameter estimates and the estimated baseline cumulative intensity functions are studied. Numerical studies demonstrate that the proposed methodology performs well for realistic sample sizes. An analysis of hospitalization data for patients in an AIDS cohort study is presented to illustrate the proposed method.

13.
Randomized trials with dropouts or censored data and discrete time-to-event type outcomes are frequently analyzed using the Kaplan-Meier or product limit (PL) estimation method. However, the PL method assumes that the censoring mechanism is noninformative and when this assumption is violated, the inferences may not be valid. We propose an expanded PL method using a Bayesian framework to incorporate informative censoring mechanism and perform sensitivity analysis on estimates of the cumulative incidence curves. The expanded method uses a model, which can be viewed as a pattern mixture model, where odds for having an event during the follow-up interval (t_{k-1}, t_k], conditional on being at risk at t_{k-1}, differ across the patterns of missing data. The sensitivity parameters relate the odds of an event, between subjects from a missing-data pattern with the observed subjects for each interval. The large number of the sensitivity parameters is reduced by considering them as random and assumed to follow a log-normal distribution with prespecified mean and variance. Then we vary the mean and variance to explore sensitivity of inferences. The missing at random (MAR) mechanism is a special case of the expanded model, thus allowing exploration of the sensitivity to inferences as departures from the inferences under the MAR assumption. The proposed approach is applied to data from the TRial Of Preventing HYpertension.

14.
Field ornithologists have used traditional culture‐based techniques to determine the presence and abundance of microbes on surfaces such as eggshells, but culture‐independent PCR‐based methods have recently been introduced. We compared the traditional culture‐based and the real‐time PCR‐based methods for detecting and quantifying Escherichia coli on the eggshells of Eurasian Magpies (Pica pica). PCR estimates of bacterial abundance were ~10 times higher than culture‐based estimates, and the culture‐based technique failed to detect bacteria at lower densities. When both methods detected bacteria, bacterial densities determined by the two methods were positively correlated, indicating that both methods can be used to study factors affecting bacterial densities. The difference between the two methods is consistent with generally acknowledged higher sensitivity of the PCR method, but the extent of the difference in our study (10×) may have been influenced by both a PCR‐based overestimation and culture‐based underestimation of bacterial densities. Our results also illustrate that bacterial counts may sometimes produce left‐censored data (i.e., we did not detect E. coli in 62% of our samples using the culture‐based method). Specific statistical methods have been developed for analyzing left‐censored data, but, to our knowledge, have not been used by ornithologists. In future studies, investigators studying bacterial loads should provide information about the possible degree of left censoring and should justify their choice of statistical methods from the broad set of available methods, including those explicitly designed for censored data.

15.
Mandel M, Betensky RA. Biometrics 2007, 63(2):405-412
Several goodness-of-fit tests of a lifetime distribution have been suggested in the literature; many take into account censoring and/or truncation of event times. In some contexts, a goodness-of-fit test for the truncation distribution is of interest. In particular, better estimates of the lifetime distribution can be obtained when knowledge of the truncation law is exploited. In cross-sectional sampling, for example, there are theoretical justifications for the assumption of a uniform truncation distribution, and several studies have used it to improve the efficiency of their survival estimates. The duality of lifetime and truncation in the absence of censoring enables methods for testing goodness of fit of the lifetime distribution to be used for testing goodness of fit of the truncation distribution. However, under random censoring, this duality does not hold and different tests are required. In this article, we introduce several goodness-of-fit tests for the truncation distribution and investigate their performance in the presence of censored event times using simulation. We demonstrate the use of our tests on two data sets.

16.
Anderson CA, McRae AF, Visscher PM. Genetics 2006, 173(3):1735-1745
Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using simulation we compare this method to both the Cox and Weibull proportional hazards models and a standard linear regression method that ignores censoring. The grouped linear regression method is of equivalent power to both the Cox and Weibull proportional hazards methods and is significantly better than the standard linear regression method when censored observations are present. The method is also robust to the proportion of censored individuals and the underlying distribution of the trait. On the basis of linear regression methodology, the grouped linear regression model is computationally simple and fast and can be implemented readily in freely available statistical software.

17.
A standard multivariate principal components (PCs) method was utilized to identify clusters of variables that may be controlled by a common gene or genes (pleiotropy). Heritability estimates were obtained and linkage analyses performed on six individual traits (total cholesterol (Chol), high and low density lipoproteins, triglycerides (TG), body mass index (BMI), and systolic blood pressure (SBP)) and on each PC to compare our ability to identify major gene effects. Using the simulated data from Genetic Analysis Workshop 13 (Cohort 1 and 2 data for year 11), the quantitative traits were first adjusted for age, sex, and smoking (cigarettes per day). Adjusted variables were standardized and PCs calculated followed by orthogonal transformation (varimax rotation). Rotated PCs were then subjected to heritability and quantitative multipoint linkage analysis. The first three PCs explained 73% of the total phenotypic variance. Heritability estimates were above 0.60 for all three PCs. We performed linkage analyses on the PCs as well as the individual traits. The majority of pleiotropic and trait-specific genes were not identified. Standard PCs analysis methods did not facilitate the identification of pleiotropic genes affecting the six traits examined in the simulated data set. In addition, genes contributing 20% of the variance in traits with over 0.60 heritability estimates could not be identified in this simulated data set using traditional quantitative trait linkage analyses. Lack of identification of pleiotropic and trait-specific genes in some cases may reflect their low contribution to the traits/PCs examined or more importantly, characteristics of the sample group analyzed, and not simply a failure of the PC approach itself.

18.
Association studies of quantitative traits have often relied on methods in which a normal distribution of the trait is assumed. However, quantitative phenotypes from complex human diseases are often censored, highly skewed, or contaminated with outlying values. We recently developed a rank-based association method that takes into account censoring and makes no distributional assumptions about the trait. In this study, we applied our new method to age-at-onset data on ALDX1 and ALDX2. Both traits are highly skewed (skewness > 1.9) and often censored. We performed a whole genome association study of age at onset of the ALDX1 trait using Illumina single-nucleotide polymorphisms. Only slightly more than 5% of markers were significant. However, we identified two regions on chromosomes 14 and 15, which each have at least four significant markers clustering together. These two regions may harbor genes that regulate age at onset of ALDX1 and ALDX2. Future fine mapping of these two regions with densely spaced markers is warranted.

19.
Huang X, Zhang N. Biometrics 2008, 64(4):1090-1099
In clinical studies, when censoring is caused by competing risks or patient withdrawal, there is always a concern about the validity of treatment effect estimates that are obtained under the assumption of independent censoring. Because dependent censoring is nonidentifiable without additional information, the best we can do is a sensitivity analysis to assess the changes of parameter estimates under different assumptions about the association between failure and censoring. This analysis is especially useful when knowledge about such association is available through literature review or expert opinions. In a regression analysis setting, the consequences of falsely assuming independent censoring on parameter estimates are not clear. Neither the direction nor the magnitude of the potential bias can be easily predicted. We provide an approach to do sensitivity analysis for the widely used Cox proportional hazards models. The joint distribution of the failure and censoring times is assumed to be a function of their marginal distributions. This function is called a copula. Under this assumption, we propose an iteration algorithm to estimate the regression parameters and marginal survival functions. Simulation studies show that this algorithm works well. We apply the proposed sensitivity analysis approach to the data from an AIDS clinical trial in which 27% of the patients withdrew due to toxicity or at the request of the patient or investigator.
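The copula assumption can be made concrete with, for example, a Clayton copula linking the marginal survivor functions of failure and censoring (a hypothetical choice for illustration; the paper's copula family and parameterization may differ):

```python
def clayton_joint_survival(s_t, s_c, delta):
    """Joint survival P(T > t, C > c) when the survivor copula is
    Clayton with dependence parameter delta > 0.  As delta -> 0 this
    recovers independent censoring, S_T(t) * S_C(c); larger delta
    means stronger positive dependence between failure and censoring."""
    return (s_t ** (-delta) + s_c ** (-delta) - 1.0) ** (-1.0 / delta)

# near-independence: joint survival approaches the product of the margins
almost_indep = clayton_joint_survival(0.7, 0.5, 1e-8)
# positive dependence inflates joint survival above that product
dependent = clayton_joint_survival(0.7, 0.5, 2.0)
```

A sensitivity analysis in this spirit re-estimates the regression parameters over a grid of `delta` values and reports how the treatment-effect estimate moves away from its independent-censoring value.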

20.
Path analysis is one of several methods available for quantitative genetic analysis, providing for both tests of hypotheses and estimates of relevant parameters. Central to the theory is the assumption that the observations follow a multivariate normal distribution within families. The purpose of the present investigation is to assess the effects of a certain type of departures from multivariate normality using quantitative family data on lipid and lipoprotein levels. The results show that even large departures produce reasonably unbiased parameter estimates. Whereas moderate departures lead to few inferential errors in hypothesis testing, gross departures from multivariate normality may have considerable effects on likelihood ratio tests.
