Schafer DW 《Biometrics》2001,57(1):53-61
This paper presents an EM algorithm for semiparametric likelihood analysis of linear, generalized linear, and nonlinear regression models with measurement errors in explanatory variables. A structural model is used in which probability distributions are specified for (a) the response and (b) the measurement error. A distribution is also assumed for the true explanatory variable but is left unspecified and is estimated by nonparametric maximum likelihood. For various types of extra information about the measurement error distribution, the proposed algorithm makes use of available routines that would be appropriate for likelihood analysis of (a) and (b) if the true x were available. Simulations suggest that the semiparametric maximum likelihood estimator retains a high degree of efficiency relative to the structural maximum likelihood estimator based on correct distributional assumptions and can outperform maximum likelihood based on an incorrect distributional assumption. The approach is illustrated on three examples with a variety of structures and types of extra information about the measurement error distribution.

Sexton J  Laake P 《Biometrics》2007,63(2):586-592
In this article, we consider nonparametric regression when covariates are measured with error. Estimation is performed using boosted regression trees, with the sum of the trees forming the estimate of the conditional expectation of the response. Both binary and continuous response regression are investigated. An approach to fitting regression trees when covariates are measured with error is described, and the boosting algorithms consist of its repeated application. The main feature of the approach is that it handles situations where multiple covariates are measured with error. Some simulation results are given as well as its application to data from the Framingham Heart Study.

Thoresen M  Laake P 《Biometrics》2000,56(3):868-872
Measurement error models in logistic regression have received considerable theoretical interest over the past 10-15 years. In this paper, we present the results of a simulation study that compares four estimation methods: the so-called regression calibration method, probit maximum likelihood as an approximation to the logistic maximum likelihood, the exact maximum likelihood method based on a logistic model, and the naive estimator, which is the result of simply ignoring the fact that some of the explanatory variables are measured with error. We have compared the behavior of these methods in a simple, additive measurement error model. We show that, in this situation, the regression calibration method is a very good alternative to more mathematically sophisticated methods.
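To make the comparison between the naive estimator and regression calibration concrete, here is a minimal simulation sketch. It is not the study design of the paper: it assumes a single covariate, a classical additive error model W = X + U with known error variance, and illustrative parameter values. The function names `fit_logistic` and `run_demo` are hypothetical, and the logistic model is fitted with a plain Newton-Raphson loop.

```python
import numpy as np


def fit_logistic(x, y, n_iter=25):
    """Fit logit P(y = 1) = b0 + b1 * x by Newton-Raphson; returns [b0, b1]."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (y - p)                         # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # information matrix
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def run_demo(n=20000, beta0=-1.0, beta1=1.0, sigma_u2=0.5, seed=0):
    """Simulate one data set and return the naive and calibrated slopes.

    All parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 1.0, n)                    # true covariate (unobserved)
    w = x + rng.normal(0.0, np.sqrt(sigma_u2), n)  # error-prone measurement
    p = 1.0 / (1.0 + np.exp(-(beta0 + beta1 * x)))
    y = (rng.uniform(size=n) < p).astype(float)

    # Naive estimator: ignore the measurement error and regress y on w.
    naive = fit_logistic(w, y)[1]

    # Regression calibration: replace w by an estimate of E[X | W],
    # using the reliability ratio lam = Var(X) / Var(W).
    var_w = w.var(ddof=1)
    lam = (var_w - sigma_u2) / var_w
    x_hat = w.mean() + lam * (w - w.mean())
    rc = fit_logistic(x_hat, y)[1]

    return {"naive": naive, "rc": rc, "lambda": lam}


if __name__ == "__main__":
    print(run_demo())
```

Because the calibrated covariate is a linear function of W in this one-covariate setting, the calibrated slope is simply the naive slope divided by the estimated reliability ratio. In logistic regression the correction is approximate, so some residual attenuation should be expected.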

Search for significant variables in nonparametric additive regression
HARDLE  W.; KOROSTELEV  A. 《Biometrika》1996,83(3):541-549

In nutritional epidemiology, dietary intake assessed with a food frequency questionnaire is prone to measurement error. Ignoring the measurement error in covariates causes estimates to be biased and leads to a loss of power. In this paper, we consider an additive error model according to the characteristics of the European Prospective Investigation into Cancer and Nutrition (EPIC)-InterAct Study data, and derive an approximate maximum likelihood estimator (AMLE) for covariates with measurement error under logistic regression. This method can be regarded as an adjusted version of regression calibration and can provide an approximately consistent estimator. Asymptotic normality of this estimator is established under regularity conditions, and simulation studies are conducted to empirically examine the finite sample performance of the proposed method. We apply AMLE to deal with measurement errors in some nutrients of interest in the EPIC-InterAct Study under a sensitivity analysis framework.
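As a worked illustration of the bias that arises when measurement error is ignored, consider the simplest case: linear regression with one covariate and classical additive error. This is a simplification and not the logistic model of the paper, and the symbols below are illustrative assumptions rather than the paper's notation.

```latex
\begin{align*}
Y &= \beta_0 + \beta_1 X + \varepsilon, \qquad
W = X + U, \qquad U \perp (X, \varepsilon), \\
\hat\beta_1^{\text{naive}}
  &\xrightarrow{p} \frac{\operatorname{Cov}(W, Y)}{\operatorname{Var}(W)}
  = \frac{\beta_1 \sigma_X^2}{\sigma_X^2 + \sigma_U^2}
  = \lambda \beta_1,
  \qquad \lambda = \frac{\sigma_X^2}{\sigma_X^2 + \sigma_U^2} < 1 .
\end{align*}
```

The naive slope is therefore shrunk toward zero by the reliability ratio \(\lambda\) whenever \(\sigma_U^2 > 0\). Regression calibration, in this simple setting, amounts to dividing the naive slope by an estimate of \(\lambda\).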

Sugar EA  Wang CY  Prentice RL 《Biometrics》2007,63(1):143-151
Regression calibration, refined regression calibration, and conditional scores estimation procedures are extended to a measurement model that is motivated by nutritional and physical activity epidemiology. Biomarker data, available on a small subset of a study cohort for reasons of cost, are assumed to adhere to a classical measurement error model, while corresponding self-report nutrient consumption or activity-related energy expenditure data are available for the entire cohort. The self-report assessment measurement model includes a person-specific random effect, the mean and variance of which may depend on individual characteristics such as body mass index or ethnicity. Logistic regression is used to relate the disease odds ratio to the actual, but unmeasured, dietary or physical activity exposure. Simulation studies are presented to evaluate and contrast the three estimation procedures, and to provide insight into preferred biomarker subsample size under selected cohort study configurations.

The Cox regression model is a popular model for analyzing the relationship between a covariate vector and a survival endpoint. The standard Cox model assumes a constant covariate effect across the entire covariate domain. However, in many epidemiological and other applications, the covariate of main interest is subject to a threshold effect: a change in the slope at a certain point within the covariate domain. Often, the covariate of interest is subject to some degree of measurement error. In this paper, we study measurement error correction in the case where the threshold is known. Several bias correction methods are examined: two versions of regression calibration (RC1 and RC2, the latter of which is new), two methods based on the induced relative risk under a rare event assumption (RR1 and RR2, the latter of which is new), a maximum pseudo-partial likelihood estimator (MPPLE), and simulation-extrapolation (SIMEX). We develop the theory, present simulations comparing the methods, and illustrate their use on data concerning the relationship between chronic air pollution exposure to particulate matter PM10 and fatal myocardial infarction (Nurses' Health Study (NHS)), and on data concerning the effect of a subject's long-term underlying systolic blood pressure level on the risk of cardiovascular disease death (Framingham Heart Study (FHS)). The simulations indicate that the best methods are RR2 and MPPLE.
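Of the methods listed above, simulation-extrapolation (SIMEX) is the easiest to sketch in a few lines. The example below is a simplified illustration and not the paper's procedure: it uses ordinary linear regression instead of a Cox model with a threshold, assumes the error variance is known, and uses a quadratic extrapolant. The function names, grid of noise levels, and parameter values are all hypothetical.

```python
import numpy as np


def ols_slope(x, y):
    """Least-squares slope of y on a single covariate x."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def simex_slope(w, y, sigma_u2, zetas=(0.0, 0.5, 1.0, 1.5, 2.0),
                n_rep=50, seed=0):
    """SIMEX slope estimate with a quadratic extrapolant (an assumed choice)."""
    rng = np.random.default_rng(seed)
    mean_slopes = []
    for zeta in zetas:
        if zeta == 0.0:
            reps = [ols_slope(w, y)]  # no extra noise: the naive fit
        else:
            # Simulation step: add extra error with variance zeta * sigma_u2,
            # refit the naive model, and average over replicates.
            sd = np.sqrt(zeta * sigma_u2)
            reps = [ols_slope(w + rng.normal(0.0, sd, w.size), y)
                    for _ in range(n_rep)]
        mean_slopes.append(np.mean(reps))
    # Extrapolation step: fit slope as a quadratic in zeta and evaluate at
    # zeta = -1, which corresponds to zero measurement error.
    coeffs = np.polyfit(zetas, mean_slopes, deg=2)
    return float(np.polyval(coeffs, -1.0))


def run_simex_demo(n=5000, beta1=1.0, sigma_u2=0.5, seed=1):
    """Simulate one data set; all parameter values are illustrative."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 1.0, n)                    # true covariate
    w = x + rng.normal(0.0, np.sqrt(sigma_u2), n)  # observed with error
    y = 2.0 + beta1 * x + rng.normal(0.0, 1.0, n)
    return {"naive": ols_slope(w, y),
            "simex": simex_slope(w, y, sigma_u2)}


if __name__ == "__main__":
    print(run_simex_demo())
```

The quadratic extrapolant removes only part of the attenuation in this setting, so the SIMEX estimate typically lands between the naive slope and the true value.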

Ibrahim JG  Chen MH  Lipsitz SR 《Biometrics》1999,55(2):591-596
We propose a method for estimating parameters for general parametric regression models with an arbitrary number of missing covariates. We allow any pattern of missing data and assume that the missing data mechanism is ignorable throughout. When the missing covariates are categorical, a useful technique for obtaining parameter estimates is the EM algorithm by the method of weights proposed in Ibrahim (1990, Journal of the American Statistical Association 85, 765-769). We extend this method to continuous or mixed categorical and continuous covariates, and for arbitrary parametric regression models, by adapting a Monte Carlo version of the EM algorithm as discussed by Wei and Tanner (1990, Journal of the American Statistical Association 85, 699-704). In addition, we discuss the Gibbs sampler for sampling from the conditional distribution of the missing covariates given the observed data and show that the appropriate complete conditionals are log-concave. The log-concavity property of the conditional distributions will facilitate a straightforward implementation of the Gibbs sampler via the adaptive rejection algorithm of Gilks and Wild (1992, Applied Statistics 41, 337-348). We assume the model for the response given the covariates is an arbitrary parametric regression model, such as a generalized linear model, a parametric survival model, or a nonlinear model. We model the marginal distribution of the covariates as a product of one-dimensional conditional distributions. This allows us a great deal of flexibility in modeling the distribution of the covariates and reduces the number of nuisance parameters that are introduced in the E-step. We present examples involving both simulated and real data.

The effects of measurement error on parameter estimation
STEFANSKI  LEONARD A. 《Biometrika》1985,72(3):583-592

It is widely believed that risks of many complex diseases are determined by genetic susceptibilities, environmental exposures, and their interaction. Chatterjee and Carroll (2005, Biometrika 92, 399-418) developed an efficient retrospective maximum-likelihood method for analysis of case-control studies that exploits an assumption of gene-environment independence and leaves the distribution of the environmental covariates to be completely nonparametric. Spinka, Carroll, and Chatterjee (2005, Genetic Epidemiology 29, 108-127) extended this approach to studies where certain types of genetic information, such as haplotype phases, may be missing on some subjects. We further extend this approach to situations when some of the environmental exposures are measured with error. Using a polychotomous logistic regression model, we allow disease status to have K + 1 levels. We propose use of a pseudolikelihood and a related EM algorithm for parameter estimation. We prove consistency and derive the resulting asymptotic covariance matrix of parameter estimates when the variance of the measurement error is known and when it is estimated using replications. Inferences with measurement error corrections are complicated by the fact that the Wald test often behaves poorly in the presence of large amounts of measurement error. The likelihood-ratio (LR) techniques are known to be a good alternative. However, the LR tests are not technically correct in this setting because the likelihood function is based on an incorrect model, i.e., a prospective model in a retrospective sampling scheme. We correct standard asymptotic results to account for the fact that the LR test is based on a likelihood-type function. The performance of the proposed method is illustrated using simulation studies emphasizing the case when genetic information is in the form of haplotypes and missing data arises from haplotype-phase ambiguity. An application of our method is illustrated using a population-based case-control study of the association between calcium intake and the risk of colorectal adenoma.

Zucker DM  Spiegelman D 《Biometrics》2004,60(2):324-334
We consider the Cox proportional hazards model with discrete-valued covariates subject to misclassification. We present a simple estimator of the regression parameter vector for this model. The estimator is based on a weighted least squares analysis of weighted-averaged transformed Kaplan-Meier curves for the different possible configurations of the observed covariate vector. Optimal weighting of the transformed Kaplan-Meier curves is described. The method is designed for the case in which the misclassification rates are known or are estimated from an external validation study. A hybrid estimator for situations with an internal validation study is also described. When there is no misclassification, the regression coefficient vector is small in magnitude, and the censoring distribution does not depend on the covariates, our estimator has the same asymptotic covariance matrix as the Cox partial likelihood estimator. We present results of a finite-sample simulation study under Weibull survival in the setting of a single binary covariate with known misclassification rates. In this simulation study, our estimator performed as well as or, in a few cases, better than the full Weibull maximum likelihood estimator. We illustrate the method on data from a study of the relationship between trans-unsaturated dietary fat consumption and cardiovascular disease incidence.

Huang YH  Hwang WH  Chen FY 《Biometrics》2011,67(4):1471-1480
Measurement errors in covariates may result in biased estimates in regression analysis. Most methods to correct this bias assume nondifferential measurement errors, i.e., that measurement errors are independent of the response variable. However, in regression models for zero-truncated count data, the number of error-prone covariate measurements for a given observational unit can equal its response count, implying a situation of differential measurement errors. To address this challenge, we develop a modified conditional score approach to achieve consistent estimation. The proposed method represents a novel technique, with efficiency gains achieved by augmenting random errors, and performs well in a simulation study. The method is demonstrated in an ecology application.
