Similar articles
1.
In this paper we study the Buckley-James estimator of accelerated failure time models with auxiliary covariates. Instead of postulating distributional assumptions on the auxiliary covariates, we use a local polynomial approximation method to accommodate them into the Buckley-James estimating equations. The regression parameters are obtained iteratively by minimizing a consecutive distance of the estimates. Asymptotic properties of the proposed estimator are investigated. Simulation studies show that the efficiency gain of using auxiliary information is remarkable when compared to just using the validation sample. The method is applied to the PBC data from the Mayo Clinic trial in primary biliary cirrhosis as an illustration.

2.
Wang CY  Huang WT 《Biometrics》2000,56(1):98-105
We consider estimation in logistic regression where some covariate variables may be missing at random. Satten and Kupper (1993, Journal of the American Statistical Association 88, 200-208) proposed estimating odds ratio parameters using methods based on the probability of exposure. By approximating a partial likelihood, we extend their idea and propose a method that estimates the cumulant-generating function of the missing covariate given observed covariates and surrogates in the controls. Our proposed method first estimates some lower order cumulants of the conditional distribution of the unobserved data and then solves a resulting estimating equation for the logistic regression parameter. A simple version of the proposed method is to replace a missing covariate by the summation of its conditional mean and conditional variance given observed data in the controls. We note that one important property of the proposed method is that, when the validation is only on controls, a class of inverse selection probability weighted semiparametric estimators cannot be applied because selection probabilities on cases are zero. The proposed estimator performs well unless the relative risk parameters are large, even though it is technically inconsistent. Small-sample simulations are conducted. We illustrate the method by an example of real data analysis.

3.
We propose an extension to the estimating equations in generalized linear models to estimate parameters in the link function and variance structure simultaneously with regression coefficients. Rather than focusing on the regression coefficients, the purpose of these models is inference about the mean of the outcome as a function of a set of covariates, and various functionals of the mean function used to measure the effects of the covariates. A commonly used functional in econometrics, referred to as the marginal effect, is the partial derivative of the mean function with respect to any covariate, averaged over the empirical distribution of covariates in the model. We define an analogous parameter for discrete covariates. The proposed estimation method not only helps to identify an appropriate link function and to suggest an underlying distribution for a specific application but also serves as a robust estimator when no specific distribution for the outcome measure can be identified. Using Monte Carlo simulations, we show that the resulting parameter estimators are consistent. The method is illustrated with an analysis of inpatient expenditure data from a study of hospitalists.
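As a rough illustration of the marginal effect described in this abstract, the sketch below averages the partial derivative of a mean function over the observed covariate rows. It assumes a fixed logistic mean function purely for illustration; the paper itself estimates the link, and the function name `average_marginal_effect` is hypothetical.

```python
import numpy as np

def average_marginal_effect(X, beta, j):
    """Average over the rows of X of d mu / d x_j for a logistic mean function.

    Assumption (not from the paper): mu(x) = 1 / (1 + exp(-x @ beta)),
    so d mu / d x_j = beta[j] * mu * (1 - mu).
    """
    X = np.asarray(X, dtype=float)
    beta = np.asarray(beta, dtype=float)
    mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
    return float(np.mean(beta[j] * mu * (1.0 - mu)))
```

For a discrete covariate, the analogous quantity would be an averaged difference in the mean function between two covariate levels rather than a derivative.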

4.
For multicenter randomized trials or multilevel observational studies, the Cox regression model has long been the primary approach to study the effects of covariates on time-to-event outcomes. A critical assumption of the Cox model is the proportionality of the hazard functions for modeled covariates, violations of which can result in ambiguous interpretations of the hazard ratio estimates. To address this issue, the restricted mean survival time (RMST), defined as the mean survival time up to a fixed time in a target population, has been recommended as a model-free target parameter. In this article, we generalize the RMST regression model to clustered data by directly modeling the RMST as a continuous function of restriction times with covariates while properly accounting for within-cluster correlations to achieve valid inference. The proposed method estimates regression coefficients via weighted generalized estimating equations, coupled with a cluster-robust sandwich variance estimator to achieve asymptotically valid inference with a sufficient number of clusters. In small-sample scenarios where a limited number of clusters are available, however, the proposed sandwich variance estimator can exhibit negative bias in capturing the variability of regression coefficient estimates. To overcome this limitation, we further propose and examine bias-corrected sandwich variance estimators to reduce the negative bias of the cluster-robust sandwich variance estimator. We study the finite-sample operating characteristics of proposed methods through simulations and reanalyze two multicenter randomized trials.
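The RMST defined in this abstract can be illustrated with a simple nonparametric calculation: the area under a Kaplan-Meier curve from 0 to the restriction time. This minimal sketch shows only the target quantity, not the clustered regression model or the sandwich variance estimators the authors propose; the function name `km_rmst` and its arguments are hypothetical.

```python
import numpy as np

def km_rmst(times, events, tau):
    """Area under the Kaplan-Meier survival curve on [0, tau].

    times  : observed follow-up times
    events : 1/True if the event was observed, 0/False if right censored
    tau    : restriction time
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    surv = 1.0      # current value of the Kaplan-Meier step function
    rmst = 0.0      # accumulated area under the curve
    prev_t = 0.0
    for t in np.unique(times[events]):   # distinct event times, ascending
        if t >= tau:
            break
        rmst += surv * (t - prev_t)      # rectangle up to this event time
        at_risk = np.sum(times >= t)
        deaths = np.sum((times == t) & events)
        surv *= 1.0 - deaths / at_risk   # Kaplan-Meier product-limit update
        prev_t = t
    rmst += surv * (tau - prev_t)        # last rectangle up to tau
    return float(rmst)
```

Without censoring, this reduces to the sample mean of min(T, tau), which gives a quick sanity check.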

5.
Horton NJ  Laird NM 《Biometrics》2001,57(1):34-42
This article presents a new method for maximum likelihood estimation of logistic regression models with incomplete covariate data where auxiliary information is available. This auxiliary information is extraneous to the regression model of interest but predictive of the covariate with missing data. Ibrahim (1990, Journal of the American Statistical Association 85, 765-769) provides a general method for estimating generalized linear regression models with missing covariates using the EM algorithm that is easily implemented when there is no auxiliary data. Vach (1997, Statistics in Medicine 16, 57-72) describes how the method can be extended when the outcome and auxiliary data are conditionally independent given the covariates in the model. The method allows the incorporation of auxiliary data without making the conditional independence assumption. We suggest tests of conditional independence and compare the performance of several estimators in an example concerning mental health service utilization in children. Using an artificial dataset, we compare the performance of several estimators when auxiliary data are available.

6.
A predictive continuous time model is developed for continuous panel data to assess the effect of time‐varying covariates on the general direction of the movement of a continuous response that fluctuates over time. This is accomplished by reparameterizing the infinitesimal mean of an Ornstein–Uhlenbeck process in terms of its equilibrium mean and a drift parameter, which assesses the rate at which the process reverts to its equilibrium mean. The equilibrium mean is modeled as a linear predictor of covariates. This model can be viewed as a continuous time first‐order autoregressive regression model with time‐varying lag effects of covariates and the response, which is more appropriate for unequally spaced panel data than its discrete time analog. Both maximum likelihood and quasi‐likelihood approaches are considered for estimating the model parameters and their performances are compared through simulation studies. The simpler quasi‐likelihood approach is suggested because it yields an estimator that is of high efficiency relative to the maximum likelihood estimator and it yields a variance estimator that is robust to the diffusion assumption of the model. To illustrate the proposed model, an application to diastolic blood pressure data from a follow‐up study on cardiovascular diseases is presented. Missing observations are handled naturally with this model.
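For orientation, the standard Ornstein–Uhlenbeck parameterization that this abstract appears to describe can be written as follows. The symbols (θ for the drift parameter, μ(x) for the equilibrium mean, σ for the diffusion scale) are illustrative choices, not the paper's notation, and the conditional moments assume the covariates are held constant over the interval of length Δ.

```latex
% Notation is illustrative and not taken from the paper.
\begin{align*}
dY(t) &= \theta\,\{\mu(x) - Y(t)\}\,dt + \sigma\,dW(t), \qquad \mu(x) = x^{\top}\beta, \\
E\{Y(t+\Delta) \mid Y(t)\} &= \mu(x) + \{Y(t) - \mu(x)\}\,e^{-\theta\Delta}, \\
\operatorname{Var}\{Y(t+\Delta) \mid Y(t)\} &= \frac{\sigma^{2}}{2\theta}\,\bigl(1 - e^{-2\theta\Delta}\bigr).
\end{align*}
```

The factor e^{-θΔ} plays the role of an autoregressive coefficient that depends on the gap Δ between observations, which is why this form suits unequally spaced panel data.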

7.
Incomplete covariate data are a common occurrence in studies in which the outcome is survival time. Further, studies in the health sciences often give rise to correlated, possibly censored, survival data. With no missing covariate data, if the marginal distributions of the correlated survival times follow a given parametric model, then the estimates using the maximum likelihood estimating equations, naively treating the correlated survival times as independent, give consistent estimates of the relative risk parameters (Lipsitz et al., 1994, 50, 842-846). Now, suppose that some observations within a cluster have some missing covariates. We show in this paper that if one naively treats observations within a cluster as independent, one can still use the maximum likelihood estimating equations to obtain consistent estimates of the relative risk parameters. This method requires the estimation of the parameters of the distribution of the covariates. We present results from a clinical trial (Lipsitz and Ibrahim, 1996b, 2, 5-14) with five covariates, four of which have some missing values. In the trial, the clusters are the hospitals in which the patients were treated.

8.
The nonparametric transformation model makes no parametric assumptions on the forms of the transformation function and the error distribution. This model is appealing in its flexibility for modeling censored survival data. Current approaches for estimation of the regression parameters involve maximizing discontinuous objective functions, which are numerically infeasible to implement with multiple covariates. Based on the partial rank (PR) estimator (Khan and Tamer, 2004), we propose a smoothed PR estimator which maximizes a smooth approximation of the PR objective function. The estimator is shown to be asymptotically equivalent to the PR estimator but is much easier to compute when there are multiple covariates. We further propose using the weighted bootstrap, which is more stable than the usual sandwich technique with smoothing parameters, for estimating the standard error. The estimator is evaluated via simulation studies and illustrated with the Veterans Administration lung cancer data set.

9.
Ross EA  Moore D 《Biometrics》1999,55(3):813-819
We have developed methods for modeling discrete or grouped time, right-censored survival data collected from correlated groups or clusters. We assume that the marginal hazard of failure for individual items within a cluster is specified by a linear log odds survival model and the dependence structure is based on a gamma frailty model. The dependence can be modeled as a function of cluster-level covariates. Likelihood equations for estimating the model parameters are provided. Generalized estimating equations for the marginal hazard regression parameters and pseudolikelihood methods for estimating the dependence parameters are also described. Data from two clinical trials are used for illustration purposes.

10.
In this paper, we propose a functional partially linear regression model with latent group structures to accommodate the heterogeneous relationship between a scalar response and functional covariates. The proposed model is motivated by a salinity tolerance study of barley families, whose main objective is to detect salinity tolerant barley plants. Our model is flexible, allowing for heterogeneous functional coefficients while being efficient by pooling information within a group for estimation. We develop an algorithm in the spirit of the K-means clustering to identify latent groups of the subjects under study. We establish the consistency of the proposed estimator, derive the convergence rate and the asymptotic distribution, and develop inference procedures. We show by simulation studies that the proposed method has higher accuracy for recovering latent groups and for estimating the functional coefficients than existing methods. The analysis of the barley data shows that the proposed method can help identify groups of barley families with different salinity tolerant abilities.

11.
In clinical trials of chronic diseases such as acquired immunodeficiency syndrome, cancer, or cardiovascular diseases, the concept of quality-adjusted lifetime (QAL) has received more and more attention. In this paper, we consider the problem of how the covariates affect the mean QAL when the data are subject to right censoring. We allow a very general form for the mean model as a function of covariates. Using the idea of inverse probability weighting, we first construct a simple weighted estimating equation for the parameters in our mean model. We then find the form of the most efficient estimating equation, which yields the most efficient estimator for the regression parameters. Since the most efficient estimator depends on the distribution of the health history processes, and thus cannot be estimated nonparametrically, we consider different approaches for improving the efficiency of the simple weighted estimating equation using observed data. The applicability of these methods is demonstrated by both simulation experiments and a data example from a breast cancer clinical trial study.

12.
Deletion diagnostics are introduced for the regression analysis of clustered binary outcomes estimated with alternating logistic regressions, an implementation of generalized estimating equations (GEE) that estimates regression coefficients in a marginal mean model and in a model for the intracluster association given by the log odds ratio. The diagnostics are developed within an estimating equations framework that recasts the estimating functions for association parameters based upon conditional residuals into equivalent functions based upon marginal residuals. Extensions of earlier work on GEE diagnostics follow directly, including computational formulae for one‐step deletion diagnostics that measure the influence of a cluster of observations on the estimated regression parameters and on the overall marginal mean or association model fit. The diagnostic formulae are evaluated with simulation studies and with an application concerning an assessment of factors associated with health maintenance visits in primary care medical practices. The application and the simulations demonstrate that the proposed cluster‐deletion diagnostics for alternating logistic regressions are good approximations of their exact fully iterated counterparts.

13.
We consider analyses of case-control studies assembled from electronic health records (EHRs) where the pool of cases is contaminated by patients who are ineligible for the study. These ineligible patients, referred to as “false cases,” should be excluded from the analyses if known. However, the true outcome status of a patient in the case pool is unknown except in a subset whose size may be arbitrarily small compared to the entire pool. To effectively remove the influence of the false cases on estimating odds ratio parameters defined by a working association model of the logistic form, we propose a general strategy to adaptively impute the unknown case status without requiring a correct phenotyping model to help discern the true and false case statuses. Our method estimates the target parameters as the solution to a set of unbiased estimating equations constructed using all available data. It outperforms existing methods by achieving robustness to mismodeling the relationship between the outcome status and covariates of interest, as well as improved estimation efficiency. We further show that our estimator is root-n-consistent and asymptotically normal. Through extensive simulation studies and analysis of real EHR data, we demonstrate that our method has desirable robustness to possible misspecification of both the association and phenotyping models, along with statistical efficiency superior to the competitors.

14.
Median regression with censored cost data
Bang H  Tsiatis AA 《Biometrics》2002,58(3):643-649
Because of the skewness of the distribution of medical costs, we consider modeling the median as well as other quantiles when establishing regression relationships to covariates. In many applications, the medical cost data are also right censored. In this article, we propose semiparametric procedures for estimating the parameters in median regression models based on weighted estimating equations when censoring is present. Numerical studies are conducted to show that our estimators perform well with small samples and the resulting inference is reliable in circumstances of practical importance. The methods are applied to a dataset for medical costs of patients with colorectal cancer.

15.
Joint regression analysis of correlated data using Gaussian copulas
Song PX  Li M  Yuan Y 《Biometrics》2009,65(1):60-68
Summary. This article concerns a new joint modeling approach for correlated data analysis. Utilizing Gaussian copulas, we present a unified and flexible machinery to integrate separate one-dimensional generalized linear models (GLMs) into a joint regression analysis of continuous, discrete, and mixed correlated outcomes. This essentially leads to a multivariate analogue of the univariate GLM theory and hence an efficiency gain in the estimation of regression coefficients. The availability of joint probability models enables us to develop a full maximum likelihood inference. Numerical illustrations are focused on regression models for discrete correlated data, including multidimensional logistic regression models and a joint model for mixed normal and binary outcomes. In the simulation studies, the proposed copula-based joint model is compared to the popular generalized estimating equations, which is a moment-based estimating equation method to join univariate GLMs. Two real-world data examples are used in the illustration.

16.
Case-control designs are widely used in rare disease studies. In a typical case-control study, data are collected from a sample of all available subjects who have experienced a disease (cases) and a sub-sample of subjects who have not experienced the disease (controls) in a study cohort. Cases are oversampled in case-control studies. Logistic regression is a common tool to estimate the relative risks of the disease with respect to a set of covariates. Very often in such a study, the ages at onset of the disease for all cases and the ages at survey of controls are known. Standard logistic regression analysis using age as a covariate is based on a dichotomous outcome and does not efficiently use such age-at-onset (time-to-event) information. We propose to analyze age-at-onset data using a modified case-cohort method by treating the control group as an approximation of a subcohort assuming rare events. We investigate the asymptotic bias of this approximation and show that the asymptotic bias of the proposed estimator is small when the disease rate is low. We evaluate the finite sample performance of the proposed method through a simulation study and illustrate the method using a breast cancer case-control data set.

17.
Gray RJ 《Biometrics》2000,56(2):571-576
An estimator of the regression parameters in a semiparametric transformed linear survival model is examined. This estimator consists of a single Newton-like update of the solution to a rank-based estimating equation from an initial consistent estimator. An automated penalized likelihood algorithm is proposed for estimating the optimal weight function for the estimating equations and the error hazard function that is needed in the variance estimator. In simulations, the estimated optimal weights are found to give reasonably efficient estimators of the regression parameters, and the variance estimators are found to perform well. The methodology is applied to an analysis of prognostic factors in non-Hodgkin's lymphoma.

18.
Zexi Cai  Tony Sit 《Biometrics》2020,76(4):1201-1215
Quantile regression is a flexible and effective tool for modeling survival data and its relationship with important covariates, which often vary over time. Informative right censoring of data from the prevalent cohort within the population often results in length-biased observations. We propose an estimating equation-based approach to obtain consistent estimators of the regression coefficients of interest based on length-biased observations with time-dependent covariates. In addition, inspired by Zeng and Lin (2008), we also develop a more numerically stable procedure for variance estimation. Large sample properties including consistency and asymptotic normality of the proposed estimator are established. Numerical studies presented demonstrate convincing performance of the proposed estimator under various settings. The application of the proposed method is demonstrated using the Oscar dataset.

19.
Summary. Nested case–control (NCC) design is a popular sampling method in large epidemiological studies for its cost effectiveness to investigate the temporal relationship of diseases with environmental exposures or biological precursors. Thomas' maximum partial likelihood estimator is commonly used to estimate the regression parameters in Cox's model for NCC data. In this article, we consider a situation in which failure/censoring information and some crude covariates are available for the entire cohort in addition to NCC data and propose an improved estimator that is asymptotically more efficient than Thomas' estimator. We adopt a projection approach that, heretofore, has only been employed in situations of random validation sampling and show that it can be well adapted to NCC designs where the sampling scheme is a dynamic process and is not independent for controls. Under certain conditions, consistency and asymptotic normality of the proposed estimator are established and a consistent variance estimator is also developed. Furthermore, a simplified approximate estimator is proposed when the disease is rare. Extensive simulations are conducted to evaluate the finite sample performance of our proposed estimators and to compare the efficiency with Thomas' estimator and other competing estimators. Moreover, sensitivity analyses are conducted to demonstrate the behavior of the proposed estimator when model assumptions are violated, and we find that the biases are reasonably small in realistic situations. We further demonstrate the proposed method with data from studies on Wilms' tumor.

20.
In nutritional epidemiology, dietary intake assessed with a food frequency questionnaire is prone to measurement error. Ignoring the measurement error in covariates causes estimates to be biased and leads to a loss of power. In this paper, we consider an additive error model according to the characteristics of the European Prospective Investigation into Cancer and Nutrition (EPIC)‐InterAct Study data, and derive an approximate maximum likelihood estimation (AMLE) for covariates with measurement error under logistic regression. This method can be regarded as an adjusted version of regression calibration and can provide an approximately consistent estimator. Asymptotic normality of this estimator is established under regularity conditions, and simulation studies are conducted to empirically examine the finite sample performance of the proposed method. We apply AMLE to deal with measurement errors in some nutrients of interest in the EPIC‐InterAct Study under a sensitivity analysis framework.
