首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 36 毫秒
1.
In cohort studies the outcome is often time to a particular event, and subjects are followed at regular intervals. Periodic visits may also monitor a secondary irreversible event influencing the event of primary interest, and a significant proportion of subjects develop the secondary event over the period of follow‐up. The status of the secondary event serves as a time‐varying covariate, but is recorded only at the times of the scheduled visits, generating incomplete time‐varying covariates. While information on a typical time‐varying covariate is missing for entire follow‐up period except the visiting times, the status of the secondary event are unavailable only between visits where the status has changed, thus interval‐censored. One may view interval‐censored covariate of the secondary event status as missing time‐varying covariates, yet missingness is partial since partial information is provided throughout the follow‐up period. Current practice of using the latest observed status produces biased estimators, and the existing missing covariate techniques cannot accommodate the special feature of missingness due to interval censoring. To handle interval‐censored covariates in the Cox proportional hazards model, we propose an available‐data estimator, a doubly robust‐type estimator as well as the maximum likelihood estimator via EM algorithm and present their asymptotic properties. We also present practical approaches that are valid. We demonstrate the proposed methods using our motivating example from the Northern Manhattan Study.  相似文献   

2.
We consider longitudinal studies in which the outcome observed over time is binary and the covariates of interest are categorical. With no missing responses or covariates, one specifies a multinomial model for the responses given the covariates and uses maximum likelihood to estimate the parameters. Unfortunately, incomplete data in the responses and covariates are a common occurrence in longitudinal studies. Here we assume the missing data are missing at random (Rubin, 1976, Biometrika 63, 581-592). Since all of the missing data (responses and covariates) are categorical, a useful technique for obtaining maximum likelihood parameter estimates is the EM algorithm by the method of weights proposed in Ibrahim (1990, Journal of the American Statistical Association 85, 765-769). In using the EM algorithm with missing responses and covariates, one specifies the joint distribution of the responses and covariates. Here we consider the parameters of the covariate distribution as a nuisance. In data sets where the percentage of missing data is high, the estimates of the nuisance parameters can lead to highly unstable estimates of the parameters of interest. We propose a conditional model for the covariate distribution that has several modeling advantages for the EM algorithm and provides a reduction in the number of nuisance parameters, thus providing more stable estimates in finite samples.  相似文献   

3.
Zexi Cai  Tony Sit 《Biometrics》2020,76(4):1201-1215
Quantile regression is a flexible and effective tool for modeling survival data and its relationship with important covariates, which often vary over time. Informative right censoring of data from the prevalent cohort within the population often results in length-biased observations. We propose an estimating equation-based approach to obtain consistent estimators of the regression coefficients of interest based on length-biased observations with time-dependent covariates. In addition, inspired by Zeng and Lin 2008, we also develop a more numerically stable procedure for variance estimation. Large sample properties including consistency and asymptotic normality of the proposed estimator are established. Numerical studies presented demonstrate convincing performance of the proposed estimator under various settings. The application of the proposed method is demonstrated using the Oscar dataset.  相似文献   

4.
Case-cohort analysis with accelerated failure time model   总被引:1,自引:0,他引:1  
Kong L  Cai J 《Biometrics》2009,65(1):135-142
Summary .  In a case–cohort design, covariates are assembled only for a subcohort that is randomly selected from the entire cohort and any additional cases outside the subcohort. This design is appealing for large cohort studies of rare disease, especially when the exposures of interest are expensive to ascertain for all the subjects. We propose statistical methods for analyzing the case–cohort data with a semiparametric accelerated failure time model that interprets the covariates effects as to accelerate or decelerate the time to failure. Asymptotic properties of the proposed estimators are developed. The finite sample properties of case–cohort estimator and its relative efficiency to full cohort estimator are assessed via simulation studies. A real example from a study of cardiovascular disease is provided to illustrate the estimating procedure.  相似文献   

5.
Complex diseases like cancers can often be classified into subtypes using various pathological and molecular traits of the disease. In this article, we develop methods for analysis of disease incidence in cohort studies incorporating data on multiple disease traits using a two-stage semiparametric Cox proportional hazards regression model that allows one to examine the heterogeneity in the effect of the covariates by the levels of the different disease traits. For inference in the presence of missing disease traits, we propose a generalization of an estimating equation approach for handling missing cause of failure in competing-risk data. We prove asymptotic unbiasedness of the estimating equation method under a general missing-at-random assumption and propose a novel influence-function-based sandwich variance estimator. The methods are illustrated using simulation studies and a real data application involving the Cancer Prevention Study II nutrition cohort.  相似文献   

6.
Imputation, weighting, direct likelihood, and direct Bayesian inference (Rubin, 1976) are important approaches for missing data regression. Many useful semiparametric estimators have been developed for regression analysis of data with missing covariates or outcomes. It has been established that some semiparametric estimators are asymptotically equivalent, but it has not been shown that many are numerically the same. We applied some existing methods to a bladder cancer case-control study and noted that they were the same numerically when the observed covariates and outcomes are categorical. To understand the analytical background of this finding, we further show that when observed covariates and outcomes are categorical, some estimators are not only asymptotically equivalent but also actually numerically identical. That is, although their estimating equations are different, they lead numerically to exactly the same root. This includes a simple weighted estimator, an augmented weighted estimator, and a mean-score estimator. The numerical equivalence may elucidate the relationship between imputing scores and weighted estimation procedures.  相似文献   

7.
Summary Nested case–control (NCC) design is a popular sampling method in large epidemiological studies for its cost effectiveness to investigate the temporal relationship of diseases with environmental exposures or biological precursors. Thomas' maximum partial likelihood estimator is commonly used to estimate the regression parameters in Cox's model for NCC data. In this article, we consider a situation in which failure/censoring information and some crude covariates are available for the entire cohort in addition to NCC data and propose an improved estimator that is asymptotically more efficient than Thomas' estimator. We adopt a projection approach that, heretofore, has only been employed in situations of random validation sampling and show that it can be well adapted to NCC designs where the sampling scheme is a dynamic process and is not independent for controls. Under certain conditions, consistency and asymptotic normality of the proposed estimator are established and a consistent variance estimator is also developed. Furthermore, a simplified approximate estimator is proposed when the disease is rare. Extensive simulations are conducted to evaluate the finite sample performance of our proposed estimators and to compare the efficiency with Thomas' estimator and other competing estimators. Moreover, sensitivity analyses are conducted to demonstrate the behavior of the proposed estimator when model assumptions are violated, and we find that the biases are reasonably small in realistic situations. We further demonstrate the proposed method with data from studies on Wilms' tumor.  相似文献   

8.
Summary In this article, we propose a positive stable shared frailty Cox model for clustered failure time data where the frailty distribution varies with cluster‐level covariates. The proposed model accounts for covariate‐dependent intracluster correlation and permits both conditional and marginal inferences. We obtain marginal inference directly from a marginal model, then use a stratified Cox‐type pseudo‐partial likelihood approach to estimate the regression coefficient for the frailty parameter. The proposed estimators are consistent and asymptotically normal and a consistent estimator of the covariance matrix is provided. Simulation studies show that the proposed estimation procedure is appropriate for practical use with a realistic number of clusters. Finally, we present an application of the proposed method to kidney transplantation data from the Scientific Registry of Transplant Recipients.  相似文献   

9.
Song X  Wang CY 《Biometrics》2008,64(2):557-566
Summary .   We study joint modeling of survival and longitudinal data. There are two regression models of interest. The primary model is for survival outcomes, which are assumed to follow a time-varying coefficient proportional hazards model. The second model is for longitudinal data, which are assumed to follow a random effects model. Based on the trajectory of a subject's longitudinal data, some covariates in the survival model are functions of the unobserved random effects. Estimated random effects are generally different from the unobserved random effects and hence this leads to covariate measurement error. To deal with covariate measurement error, we propose a local corrected score estimator and a local conditional score estimator. Both approaches are semiparametric methods in the sense that there is no distributional assumption needed for the underlying true covariates. The estimators are shown to be consistent and asymptotically normal. However, simulation studies indicate that the conditional score estimator outperforms the corrected score estimator for finite samples, especially in the case of relatively large measurement error. The approaches are demonstrated by an application to data from an HIV clinical trial.  相似文献   

10.
GOETGHEBEUR  ELS; RYAN  LOUISE 《Biometrika》1995,82(4):821-833
We propose a method to analyse competing risks survival datawhen failure types are missing for some individuals. Our approachis based on a standard proportional hazards structure for eachof the failure types, and involves the solution to estimatingequations. We present consistent and asymptotically normal estimatorsof the regression coefficients and related score tests. An appealingfeature is that individuals with known failure types make thesame contributions as they would to a standard proportionalhazards analysis. Contributions of individuals with unknownfailure types are weighted according to the probability thatthey failed from the cause of interest. Efficiency and robustnessare discussed. Results are illustrated with data from a breastcancer trial.  相似文献   

11.
Methods in the literature for missing covariate data in survival models have relied on the missing at random (MAR) assumption to render regression parameters identifiable. MAR means that missingness can depend on the observed exit time, and whether or not that exit is a failure or a censoring event. By considering ways in which missingness of covariate X could depend on the true but possibly censored failure time T and the true censoring time C, we attempt to identify missingness mechanisms which would yield MAR data. We find that, under various reasonable assumptions about how missingness might depend on T and/or C, additional strong assumptions are needed to obtain MAR. We conclude that MAR is difficult to justify in practical applications. One exception arises when missingness is independent of T, and C is independent of the value of the missing X. As alternatives to MAR, we propose two new missingness assumptions. In one, the missingness depends on T but not on C; in the other, the situation is reversed. For each, we show that the failure time model is identifiable. When missingness is independent of T, we show that the naive complete record analysis will yield a consistent estimator of the failure time distribution. When missingness is independent of C, we develop a complete record likelihood function and a corresponding estimator for parametric failure time models. We propose analyses to evaluate the plausibility of either assumption in a particular data set, and illustrate the ideas using data from the literature on this problem.  相似文献   

12.
Multiple imputation (MI) is increasingly popular for handling multivariate missing data. Two general approaches are available in standard computer packages: MI based on the posterior distribution of incomplete variables under a multivariate (joint) model, and fully conditional specification (FCS), which imputes missing values using univariate conditional distributions for each incomplete variable given all the others, cycling iteratively through the univariate imputation models. In the context of longitudinal or clustered data, it is not clear whether these approaches result in consistent estimates of regression coefficient and variance component parameters when the analysis model of interest is a linear mixed effects model (LMM) that includes both random intercepts and slopes with either covariates or both covariates and outcome contain missing information. In the current paper, we compared the performance of seven different MI methods for handling missing values in longitudinal and clustered data in the context of fitting LMMs with both random intercepts and slopes. We study the theoretical compatibility between specific imputation models fitted under each of these approaches and the LMM, and also conduct simulation studies in both the longitudinal and clustered data settings. Simulations were motivated by analyses of the association between body mass index (BMI) and quality of life (QoL) in the Longitudinal Study of Australian Children (LSAC). Our findings showed that the relative performance of MI methods vary according to whether the incomplete covariate has fixed or random effects and whether there is missingnesss in the outcome variable. We showed that compatible imputation and analysis models resulted in consistent estimation of both regression parameters and variance components via simulation. We illustrate our findings with the analysis of LSAC data.  相似文献   

13.
In some large clinical studies, it may be impractical to perform the physical examination to every subject at his/her last monitoring time in order to diagnose the occurrence of the event of interest. This gives rise to survival data with missing censoring indicators where the probability of missing may depend on time of last monitoring and some covariates. We present a fully Bayesian semi‐parametric method for such survival data to estimate regression parameters of the proportional hazards model of Cox. Theoretical investigation and simulation studies show that our method performs better than competing methods. We apply the proposed method to analyze the survival data with missing censoring indicators from the Orofacial Pain: Prospective Evaluation and Risk Assessment study.  相似文献   

14.
For regression with covariates missing not at random where the missingness depends on the missing covariate values, complete-case (CC) analysis leads to consistent estimation when the missingness is independent of the response given all covariates, but it may not have the desired level of efficiency. We propose a general empirical likelihood framework to improve estimation efficiency over the CC analysis. We expand on methods in Bartlett et al. (2014, Biostatistics 15 , 719–730) and Xie and Zhang (2017, Int J Biostat 13 , 1–20) that improve efficiency by modeling the missingness probability conditional on the response and fully observed covariates by allowing the possibility of modeling other data distribution-related quantities. We also give guidelines on what quantities to model and demonstrate that our proposal has the potential to yield smaller biases than existing methods when the missingness probability model is incorrect. Simulation studies are presented, as well as an application to data collected from the US National Health and Nutrition Examination Survey.  相似文献   

15.
Researchers in observational survival analysis are interested in not only estimating survival curve nonparametrically but also having statistical inference for the parameter. We consider right-censored failure time data where we observe n independent and identically distributed observations of a vector random variable consisting of baseline covariates, a binary treatment at baseline, a survival time subject to right censoring, and the censoring indicator. We assume the baseline covariates are allowed to affect the treatment and censoring so that an estimator that ignores covariate information would be inconsistent. The goal is to use these data to estimate the counterfactual average survival curve of the population if all subjects are assigned the same treatment at baseline. Existing observational survival analysis methods do not result in monotone survival curve estimators, which is undesirable and may lose efficiency by not constraining the shape of the estimator using the prior knowledge of the estimand. In this paper, we present a one-step Targeted Maximum Likelihood Estimator (TMLE) for estimating the counterfactual average survival curve. We show that this new TMLE can be executed via recursion in small local updates. We demonstrate the finite sample performance of this one-step TMLE in simulations and an application to a monoclonal gammopathy data.  相似文献   

16.
Dewanji A  Sengupta D 《Biometrics》2003,59(4):1063-1070
In competing risks data, missing failure types (causes) is a very common phenomenon. In this work, we consider a general missing pattern in which, if a failure type is not observed, one observes a set of possible types containing the true type, along with the failure time. We first consider maximum likelihood estimation with missing-at-random assumption via the expectation maximization (EM) algorithm. We then propose a Nelson-Aalen type estimator for situations when certain information on the conditional probability of the true type given a set of possible failure types is available from the experimentalists. This is based on a least-squares type method using the relationships between hazards for different types and hazards for different combinations of missing types. We conduct a simulation study to investigate the performance of this method, which indicates that bias may be small, even for high proportion of missing data, for sufficiently large number of observations. The estimates are somewhat sensitive to misspecification of the conditional probabilities of the true types when the missing proportion is high. We also consider an example from an animal experiment to illustrate our methodology.  相似文献   

17.
King R  Brooks SP  Coulson T 《Biometrics》2008,64(4):1187-1195
SUMMARY: We consider the issue of analyzing complex ecological data in the presence of covariate information and model uncertainty. Several issues can arise when analyzing such data, not least the need to take into account where there are missing covariate values. This is most acutely observed in the presence of time-varying covariates. We consider mark-recapture-recovery data, where the corresponding recapture probabilities are less than unity, so that individuals are not always observed at each capture event. This often leads to a large amount of missing time-varying individual covariate information, because the covariate cannot usually be recorded if an individual is not observed. In addition, we address the problem of model selection over these covariates with missing data. We consider a Bayesian approach, where we are able to deal with large amounts of missing data, by essentially treating the missing values as auxiliary variables. This approach also allows a quantitative comparison of different models via posterior model probabilities, obtained via the reversible jump Markov chain Monte Carlo algorithm. To demonstrate this approach we analyze data relating to Soay sheep, which pose several statistical challenges in fully describing the intricacies of the system.  相似文献   

18.
Wen CC  Lin CT 《Biometrics》2011,67(3):760-769
Statistical inference based on right-censored data for the proportional hazards (PH) model with missing covariates has received considerable attention, but interval-censored or current status data with missing covariates has not yet been investigated. Our study is partly motivated by the analysis of fracture data from the 2005 National Health Interview Survey Original Database in Taiwan, where the occurrence of fractures was interval censored and the covariate osteoporosis was not reported for all residents. We assume that the data are realized from a PH model. A semiparametric maximum likelihood estimate implemented by a hybrid algorithm is proposed to analyze current status data with missing covariates. A comparison of the performance of our method with full-cohort analysis, complete-case analysis, and surrogate analysis is made via simulation with moderate sample sizes. The fracture data are then analyzed.  相似文献   

19.
Roy J  Lin X 《Biometrics》2005,61(3):837-846
We consider estimation in generalized linear mixed models (GLMM) for longitudinal data with informative dropouts. At the time a unit drops out, time-varying covariates are often unobserved in addition to the missing outcome. However, existing informative dropout models typically require covariates to be completely observed. This assumption is not realistic in the presence of time-varying covariates. In this article, we first study the asymptotic bias that would result from applying existing methods, where missing time-varying covariates are handled using naive approaches, which include: (1) using only baseline values; (2) carrying forward the last observation; and (3) assuming the missing data are ignorable. Our asymptotic bias analysis shows that these naive approaches yield inconsistent estimators of model parameters. We next propose a selection/transition model that allows covariates to be missing in addition to the outcome variable at the time of dropout. The EM algorithm is used for inference in the proposed model. Data from a longitudinal study of human immunodeficiency virus (HIV)-infected women are used to illustrate the methodology.  相似文献   

20.
Latent class analysis is an intuitive tool to characterize disease phenotype heterogeneity. With data more frequently collected on multiple phenotypes in chronic disease studies, it is of rising interest to investigate how the latent classes embedded in one phenotype are related to another phenotype. Motivated by a cohort with mild cognitive impairment (MCI) from the Uniform Data Set (UDS), we propose and study a time-dependent structural model to evaluate the association between latent classes and competing risk outcomes that are subject to missing failure types. We develop a two-step estimation procedure which circumvents latent class membership assignment and is rigorously justified in terms of accounting for the uncertainty in classifying latent classes. The new method also properly addresses the realistic complications for competing risks outcomes, including random censoring and missing failure types. The asymptotic properties of the resulting estimator are established. Given that the standard bootstrapping inference is not feasible in the current problem setting, we develop analytical inference procedures, which are easy to implement. Our simulation studies demonstrate the advantages of the proposed method over benchmark approaches. We present an application to the MCI data from UDS, which uncovers a detailed picture of the neuropathological relevance of the baseline MCI subgroups.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号