首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 30 毫秒
1.
This paper deals with a Cox proportional hazards regression model, where some covariates of interest are randomly right‐censored. While methods for censored outcomes have become ubiquitous in the literature, methods for censored covariates have thus far received little attention and, for the most part, dealt with the issue of limit‐of‐detection. For randomly censored covariates, an often‐used method is the inefficient complete‐case analysis (CCA) which consists in deleting censored observations in the data analysis. When censoring is not completely independent, the CCA leads to biased and spurious results. Methods for missing covariate data, including type I and type II covariate censoring as well as limit‐of‐detection do not readily apply due to the fundamentally different nature of randomly censored covariates. We develop a novel method for censored covariates using a conditional mean imputation based on either Kaplan–Meier estimates or a Cox proportional hazards model to estimate the effects of these covariates on a time‐to‐event outcome. We evaluate the performance of the proposed method through simulation studies and show that it provides good bias reduction and statistical efficiency. Finally, we illustrate the method using data from the Framingham Heart Study to assess the relationship between offspring and parental age of onset of cardiovascular events.  相似文献   

2.
Summary Dietary assessment of episodically consumed foods gives rise to nonnegative data that have excess zeros and measurement error. Tooze et al. (2006, Journal of the American Dietetic Association 106 , 1575–1587) describe a general statistical approach (National Cancer Institute method) for modeling such food intakes reported on two or more 24‐hour recalls (24HRs) and demonstrate its use to estimate the distribution of the food's usual intake in the general population. In this article, we propose an extension of this method to predict individual usual intake of such foods and to evaluate the relationships of usual intakes with health outcomes. Following the regression calibration approach for measurement error correction, individual usual intake is generally predicted as the conditional mean intake given 24HR‐reported intake and other covariates in the health model. One feature of the proposed method is that additional covariates potentially related to usual intake may be used to increase the precision of estimates of usual intake and of diet‐health outcome associations. Applying the method to data from the Eating at America's Table Study, we quantify the increased precision obtained from including reported frequency of intake on a food frequency questionnaire (FFQ) as a covariate in the calibration model. We then demonstrate the method in evaluating the linear relationship between log blood mercury levels and fish intake in women by using data from the National Health and Nutrition Examination Survey, and show increased precision when including the FFQ information. Finally, we present simulation results evaluating the performance of the proposed method in this context.  相似文献   

3.
Multiple imputation (MI) is increasingly popular for handling multivariate missing data. Two general approaches are available in standard computer packages: MI based on the posterior distribution of incomplete variables under a multivariate (joint) model, and fully conditional specification (FCS), which imputes missing values using univariate conditional distributions for each incomplete variable given all the others, cycling iteratively through the univariate imputation models. In the context of longitudinal or clustered data, it is not clear whether these approaches result in consistent estimates of regression coefficient and variance component parameters when the analysis model of interest is a linear mixed effects model (LMM) that includes both random intercepts and slopes with either covariates or both covariates and outcome contain missing information. In the current paper, we compared the performance of seven different MI methods for handling missing values in longitudinal and clustered data in the context of fitting LMMs with both random intercepts and slopes. We study the theoretical compatibility between specific imputation models fitted under each of these approaches and the LMM, and also conduct simulation studies in both the longitudinal and clustered data settings. Simulations were motivated by analyses of the association between body mass index (BMI) and quality of life (QoL) in the Longitudinal Study of Australian Children (LSAC). Our findings showed that the relative performance of MI methods vary according to whether the incomplete covariate has fixed or random effects and whether there is missingnesss in the outcome variable. We showed that compatible imputation and analysis models resulted in consistent estimation of both regression parameters and variance components via simulation. We illustrate our findings with the analysis of LSAC data.  相似文献   

4.
Summary Time varying, individual covariates are problematic in experiments with marked animals because the covariate can typically only be observed when each animal is captured. We examine three methods to incorporate time varying, individual covariates of the survival probabilities into the analysis of data from mark‐recapture‐recovery experiments: deterministic imputation, a Bayesian imputation approach based on modeling the joint distribution of the covariate and the capture history, and a conditional approach considering only the events for which the associated covariate data are completely observed (the trinomial model). After describing the three methods, we compare results from their application to the analysis of the effect of body mass on the survival of Soay sheep (Ovis aries) on the Isle of Hirta, Scotland. Simulations based on these results are then used to make further comparisons. We conclude that both the trinomial model and Bayesian imputation method perform best in different situations. If the capture and recovery probabilities are all high, then the trinomial model produces precise, unbiased estimators that do not depend on any assumptions regarding the distribution of the covariate. In contrast, the Bayesian imputation method performs substantially better when capture and recovery probabilities are low, provided that the specified model of the covariate is a good approximation to the true data‐generating mechanism.  相似文献   

5.
Our present work proposes a new survival model in a Bayesian context to analyze right‐censored survival data for populations with a surviving fraction, assuming that the log failure time follows a generalized extreme value distribution. Many applications require a more flexible modeling of covariate information than a simple linear or parametric form for all covariate effects. It is also necessary to include the spatial variation in the model, since it is sometimes unexplained by the covariates considered in the analysis. Therefore, the nonlinear covariate effects and the spatial effects are incorporated into the systematic component of our model. Gaussian processes (GPs) provide a natural framework for modeling potentially nonlinear relationship and have recently become extremely powerful in nonlinear regression. Our proposed model adopts a semiparametric Bayesian approach by imposing a GP prior on the nonlinear structure of continuous covariate. With the consideration of data availability and computational complexity, the conditionally autoregressive distribution is placed on the region‐specific frailties to handle spatial correlation. The flexibility and gains of our proposed model are illustrated through analyses of simulated data examples as well as a dataset involving a colon cancer clinical trial from the state of Iowa.  相似文献   

6.
Liang Li  Bo Hu  Tom Greene 《Biometrics》2009,65(3):737-745
Summary .  In many longitudinal clinical studies, the level and progression rate of repeatedly measured biomarkers on each subject quantify the severity of the disease and that subject's susceptibility to progression of the disease. It is of scientific and clinical interest to relate such quantities to a later time-to-event clinical endpoint such as patient survival. This is usually done with a shared parameter model. In such models, the longitudinal biomarker data and the survival outcome of each subject are assumed to be conditionally independent given subject-level severity or susceptibility (also called frailty in statistical terms). In this article, we study the case where the conditional distribution of longitudinal data is modeled by a linear mixed-effect model, and the conditional distribution of the survival data is given by a Cox proportional hazard model. We allow unknown regression coefficients and time-dependent covariates in both models. The proposed estimators are maximizers of an exact correction to the joint log likelihood with the frailties eliminated as nuisance parameters, an idea that originated from correction of covariate measurement error in measurement error models. The corrected joint log likelihood is shown to be asymptotically concave and leads to consistent and asymptotically normal estimators. Unlike most published methods for joint modeling, the proposed estimation procedure does not rely on distributional assumptions of the frailties. The proposed method was studied in simulations and applied to a data set from the Hemodialysis Study.  相似文献   

7.
Andreas Lindén  Jonas Knape 《Oikos》2009,118(5):675-680
Within the paradigm of population dynamics a central task is to identify environmental factors affecting population change and to estimate the strength of these effects. We here investigate the impact of observation errors in measurements of population densities on estimates of environmental effects. Adding observation errors may change the autocorrelation of a population time series with potential consequences for estimates of effects of autocorrelated environmental covariates. Using Monte Carlo simulations, we compare the performance of maximum likelihood estimates from three stochastic versions of the Gompertz model (log–linear first order autoregressive model), assuming 1) process error only, 2) observation error only, and 3) both process and observation error (the linear state–space model on log‐scale). We also simulated population dynamics using the Ricker model, and evaluated the corresponding maximum likelihood estimates for process error models. When there is observation error in the data and the considered environmental variable is strongly autocorrelated, its estimated effect is likely to be biased when using process error models. The environmental effect is overestimated when the sign of the autocorrelations of the intrinsic dynamics and the environment are the same and underestimated when the signs differ. With non‐autocorrelated environmental covariates, process error models produce fairly exact point estimates as well as reliable confidence intervals for environmental effects. In all scenarios, observation error models produce unbiased estimates with reasonable precision, but confidence intervals derived from the likelihood profiles are far too optimistic if there is process error present. The safest approach is to use state–space models in presence of observation error. These are factors worthwhile to consider when interpreting earlier empirical results on population time series, and in future studies, we recommend choosing carefully the modelling approach with respect to intrinsic population dynamics and covariate autocorrelation.  相似文献   

8.
9.
Maps depicting cancer incidence rates have become useful tools in public health research, giving valuable information about the spatial variation in rates of disease. Typically, these maps are generated using count data aggregated over areas such as counties or census blocks. However, with the proliferation of geographic information systems and related databases, it is becoming easier to obtain exact spatial locations for the cancer cases and suitable control subjects. The use of such point data allows us to adjust for individual-level covariates, such as age and smoking status, when estimating the spatial variation in disease risk. Unfortunately, such covariate information is often subject to missingness. We propose a method for mapping cancer risk when covariates are not completely observed. We model these data using a logistic generalized additive model. Estimates of the linear and non-linear effects are obtained using a mixed effects model representation. We develop an EM algorithm to account for missing data and the random effects. Since the expectation step involves an intractable integral, we estimate the E-step with a Laplace approximation. This framework provides a general method for handling missing covariate values when fitting generalized additive models. We illustrate our method through an analysis of cancer incidence data from Cape Cod, Massachusetts. These analyses demonstrate that standard complete-case methods can yield biased estimates of the spatial variation of cancer risk.  相似文献   

10.
In order to study family‐based association in the presence of linkage, we extend a generalized linear mixed model proposed for genetic linkage analysis (Lebrec and van Houwelingen (2007), Human Heredity 64 , 5–15) by adding a genotypic effect to the mean. The corresponding score test is a weighted family‐based association tests statistic, where the weight depends on the linkage effect and on other genetic and shared environmental effects. For testing of genetic association in the presence of gene–covariate interaction, we propose a linear regression method where the family‐specific score statistic is regressed on family‐specific covariates. Both statistics are straightforward to compute. Simulation results show that adjusting the weight for the within‐family variance structure may be a powerful approach in the presence of environmental effects. The test statistic for genetic association in the presence of gene–covariate interaction improved the power for detecting association. For illustration, we analyze the rheumatoid arthritis data from GAW15. Adjusting for smoking and anti‐cyclic citrullinated peptide increased the significance of the association with the DR locus.  相似文献   

11.
D P Byar  N Mantel 《Biometrics》1975,31(4):943-947
Interrelationships among three response-time models which incorporate covariate information are explored. The most general of these models is the logistic-exponential in which the log odds of the probability of responding in a fixed interval is assumed to be a linear function of the covariates; this model includes a parameter W for the width of discrete time intervals in which responses occur. As W leads to O this model is equivalent to a continuous time exponential model in which the log hazard is linear in the covariates. As W leads to infininity it is equivalent to a continuous time exponential model in which the hazard itself is a linear function of the covariates. This second model was fitted to the data used in an earlier publication describing the logistic exponential model, and very close agreement of the estimates of the regression coefficients is demonstrated.  相似文献   

12.
We consider longitudinal studies in which the outcome observed over time is binary and the covariates of interest are categorical. With no missing responses or covariates, one specifies a multinomial model for the responses given the covariates and uses maximum likelihood to estimate the parameters. Unfortunately, incomplete data in the responses and covariates are a common occurrence in longitudinal studies. Here we assume the missing data are missing at random (Rubin, 1976, Biometrika 63, 581-592). Since all of the missing data (responses and covariates) are categorical, a useful technique for obtaining maximum likelihood parameter estimates is the EM algorithm by the method of weights proposed in Ibrahim (1990, Journal of the American Statistical Association 85, 765-769). In using the EM algorithm with missing responses and covariates, one specifies the joint distribution of the responses and covariates. Here we consider the parameters of the covariate distribution as a nuisance. In data sets where the percentage of missing data is high, the estimates of the nuisance parameters can lead to highly unstable estimates of the parameters of interest. We propose a conditional model for the covariate distribution that has several modeling advantages for the EM algorithm and provides a reduction in the number of nuisance parameters, thus providing more stable estimates in finite samples.  相似文献   

13.
In many studies, the association of longitudinal measurements of a continuous response and a binary outcome are often of interest. A convenient framework for this type of problems is the joint model, which is formulated to investigate the association between a binary outcome and features of longitudinal measurements through a common set of latent random effects. The joint model, which is the focus of this article, is a logistic regression model with covariates defined as the individual‐specific random effects in a non‐linear mixed‐effects model (NLMEM) for the longitudinal measurements. We discuss different estimation procedures, which include two‐stage, best linear unbiased predictors, and various numerical integration techniques. The proposed methods are illustrated using a real data set where the objective is to study the association between longitudinal hormone levels and the pregnancy outcome in a group of young women. The numerical performance of the estimating methods is also evaluated by means of simulation.  相似文献   

14.
Group randomized trials (GRTs) randomize groups, or clusters, of people to intervention or control arms. To test for the effectiveness of the intervention when subject‐level outcomes are binary, and while fitting a marginal model that adjusts for cluster‐level covariates and utilizes a logistic link, we develop a pseudo‐Wald statistic to improve inference. Alternative Wald statistics could employ bias‐corrected empirical sandwich standard error estimates, which have received limited attention in the GRT literature despite their broad utility and applicability in our settings of interest. The test could also be carried out using popular approaches based upon cluster‐level summary outcomes. A simulation study covering a variety of realistic GRT settings is used to compare the accuracy of these methods in terms of producing nominal test sizes. Tests based upon the pseudo‐Wald statistic and a cluster‐level summary approach utilizing the natural log of observed cluster‐level odds worked best. Due to weighting, some popular cluster‐level summary approaches were found to lead to invalid inference in many settings. Finally, although use of bias‐corrected empirical sandwich standard error estimates did not consistently result in nominal sizes, they did work well, thus supporting the applicability of marginal models in GRT settings.  相似文献   

15.
Several diseases have common risk factors. The joint modeling of disease outcomes within a spatial statistical context may provide more insight on the interaction of diseases both at individual and at regional level. Spatial joint modeling allows for studying of the relationship between diseases and also between regions under study. One major approach for joint spatial modeling is the multivariate conditional autoregressive approach. In this approach, it is assumed that all the covariates in the study have linear effects on the multiple response variables. In this study, we relax this linearity assumption and allow some covariates to have nonlinear effects using the penalized regression splines. This model was used to jointly model the spatial variation of human immunodeficiency virus (HIV) and herpes simplex virus-type 2 (HSV-2) among women in Kenya. The model was applied to HIV and HSV-2 prevalence data among women aged 15–49 years in Kenya, derived from the 2007 Kenya AIDS indicator survey. A full Bayesian approach was used and the models were implemented in WinBUGS software. Both diseases showed significant spatial variation with highest disease burdens occurring around the Lake Victoria region. There was a nonlinear association between age of an individual and HIV and HSV-2 infection. The peak age for HIV was around 30 years while that of HSV-2 was about 40 years. A positive significant spatial correlation between HIV and HSV-2 was observed with a correlation of 0.6831(95% CI: 0.3859, 0.871).  相似文献   

16.
Summary A time‐specific log‐linear regression method on quantile residual lifetime is proposed. Under the proposed regression model, any quantile of a time‐to‐event distribution among survivors beyond a certain time point is associated with selected covariates under right censoring. Consistency and asymptotic normality of the regression estimator are established. An asymptotic test statistic is proposed to evaluate the covariate effects on the quantile residual lifetimes at a specific time point. Evaluation of the test statistic does not require estimation of the variance–covariance matrix of the regression estimators, which involves the probability density function of the survival distribution with censoring. Simulation studies are performed to assess finite sample properties of the regression parameter estimator and test statistic. The new regression method is applied to a breast cancer data set with long‐term follow‐up to estimate the patients' median residual lifetimes, adjusting for important prognostic factors.  相似文献   

17.
The power variance function distributions, which include the gamma and compound Poisson (CP) distributions among others, are commonly used in frailty models for family data. In a previous paper, we presented a frailty model constructed by randomizing the scale parameter in a CP distribution. When combined with a parametric baseline hazard, this yields a model with heterogeneity on both the individual and the family level and a subgroup with zero frailty, corresponding to people not experiencing the event. In this paper, we discuss covariates in the model. Depending on where the covariates are inserted in the model, one may have proportional hazards at the individual level, the family level, and a larger group level (for covariates shared by many families, e.g. ethnic groups) or get accelerated failure times. Each of these alternatives gives a specific interpretation of the covariate effects. An application to data infant mortality in siblings from the Medical Birth Registry of Norway is included. We compare the results for some of the different covariate modeling options.  相似文献   

18.
Summary In this article, we propose a positive stable shared frailty Cox model for clustered failure time data where the frailty distribution varies with cluster‐level covariates. The proposed model accounts for covariate‐dependent intracluster correlation and permits both conditional and marginal inferences. We obtain marginal inference directly from a marginal model, then use a stratified Cox‐type pseudo‐partial likelihood approach to estimate the regression coefficient for the frailty parameter. The proposed estimators are consistent and asymptotically normal and a consistent estimator of the covariance matrix is provided. Simulation studies show that the proposed estimation procedure is appropriate for practical use with a realistic number of clusters. Finally, we present an application of the proposed method to kidney transplantation data from the Scientific Registry of Transplant Recipients.  相似文献   

19.
In many longitudinal studies, it is of interest to characterize the relationship between a time-to-event (e.g. survival) and several time-dependent and time-independent covariates. Time-dependent covariates are generally observed intermittently and with error. For a single time-dependent covariate, a popular approach is to assume a joint longitudinal data-survival model, where the time-dependent covariate follows a linear mixed effects model and the hazard of failure depends on random effects and time-independent covariates via a proportional hazards relationship. Regression calibration and likelihood or Bayesian methods have been advocated for implementation; however, generalization to more than one time-dependent covariate may become prohibitive. For a single time-dependent covariate, Tsiatis and Davidian (2001) have proposed an approach that is easily implemented and does not require an assumption on the distribution of the random effects. This technique may be generalized to multiple, possibly correlated, time-dependent covariates, as we demonstrate. We illustrate the approach via simulation and by application to data from an HIV clinical trial.  相似文献   

20.
Horton NJ  Laird NM 《Biometrics》2001,57(1):34-42
This article presents a new method for maximum likelihood estimation of logistic regression models with incomplete covariate data where auxiliary information is available. This auxiliary information is extraneous to the regression model of interest but predictive of the covariate with missing data. Ibrahim (1990, Journal of the American Statistical Association 85, 765-769) provides a general method for estimating generalized linear regression models with missing covariates using the EM algorithm that is easily implemented when there is no auxiliary data. Vach (1997, Statistics in Medicine 16, 57-72) describes how the method can be extended when the outcome and auxiliary data are conditionally independent given the covariates in the model. The method allows the incorporation of auxiliary data without making the conditional independence assumption. We suggest tests of conditional independence and compare the performance of several estimators in an example concerning mental health service utilization in children. Using an artificial dataset, we compare the performance of several estimators when auxiliary data are available.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号