Similar articles
 Found 20 similar articles (search time: 296 ms)
1.
Stratified Cox regression models with a large number of strata and small stratum size are useful in many settings, including matched case-control family studies. In the presence of measurement error in covariates and a large number of strata, we show that extensions of existing methods fail either to reduce the bias or to correct the bias under nonsymmetric distributions of the true covariate or the error term. We propose a nonparametric correction method for the estimation of regression coefficients, and show that the estimators are asymptotically consistent for the true parameters. Small sample properties are evaluated in a simulation study. The method is illustrated with an analysis of Framingham data.

2.
In biomedical cohort studies for assessing the association between an outcome variable and a set of covariates, usually, some covariates can only be measured on a subgroup of study subjects. An important design question is which subjects to select into the subgroup to increase statistical efficiency. When the outcome is binary, one may adopt a case-control sampling design or a balanced case-control design where cases and controls are further matched on a small number of complete discrete covariates. While the latter achieves success in estimating odds ratio (OR) parameters for the matching covariates, similar two-phase design options have not been explored for the remaining covariates, especially the incompletely collected ones. This is of great importance in studies where the covariates of interest cannot be completely collected. To this end, assuming that an external model is available to relate the outcome and complete covariates, we propose a novel sampling scheme that oversamples cases and controls with worse goodness-of-fit based on the external model and further matches them on complete covariates similarly to the balanced design. We develop a pseudolikelihood method for estimating OR parameters. Through simulation studies and explorations in a real-cohort study, we find that our design generally leads to reduced asymptotic variances of the OR estimates and the reduction for the matching covariates is comparable to that of the balanced design.

3.
Chen J, Rodriguez C. Biometrics, 2007, 63(4): 1099-1107
Genetic epidemiologists routinely assess disease susceptibility in relation to haplotypes, that is, combinations of alleles on a single chromosome. We study statistical methods for inferring haplotype-related disease risk using single nucleotide polymorphism (SNP) genotype data from matched case-control studies, where controls are individually matched to cases on some selected factors. Assuming a logistic regression model for haplotype-disease association, we propose two conditional likelihood approaches that address the issue that haplotypes cannot be inferred with certainty from SNP genotype data (phase ambiguity). One approach is based on the likelihood of disease status conditioned on the total number of cases, genotypes, and other covariates within each matching stratum, and the other is based on the joint likelihood of disease status and genotypes conditioned only on the total number of cases and other covariates. The joint-likelihood approach is generally more efficient, particularly for assessing haplotype-environment interactions. Simulation studies demonstrated that the first approach was more robust to model assumptions on the diplotype distribution conditioned on environmental risk variables and matching factors in the control population. We applied the two methods to analyze a matched case-control study of prostate cancer.

4.
We introduce a new method, moment reconstruction, of correcting for measurement error in covariates in regression models. The central idea is similar to regression calibration in that the values of the covariates that are measured with error are replaced by "adjusted" values. In regression calibration the adjusted value is the expectation of the true value conditional on the measured value. In moment reconstruction the adjusted value is the variance-preserving empirical Bayes estimate of the true value conditional on the outcome variable. The adjusted values thereby have the same first two moments and the same covariance with the outcome variable as the unobserved "true" covariate values. We show that moment reconstruction is equivalent to regression calibration in the case of linear regression, but leads to different results for logistic regression. For case-control studies with logistic regression and covariates that are normally distributed within cases and controls, we show that the resulting estimates of the regression coefficients are consistent. In simulations we demonstrate that for logistic regression, moment reconstruction carries less bias than regression calibration, and for case-control studies is superior in mean-square error to the standard regression calibration approach. Finally, we give an example of the use of moment reconstruction in linear discriminant analysis and a nonstandard problem where we wish to adjust a classification tree for measurement error in the explanatory variables.
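The two adjustments described in this abstract can be written out in symbols. The following is only a sketch under assumptions that the abstract does not state: a classical additive error model W = X + U, with U independent of (X, Y) and a known error variance \sigma_U^2. The notation (X_RC, X_MR, G) is chosen here for illustration.

```latex
\begin{align*}
\text{Regression calibration:}\quad
  & X_{\mathrm{RC}}(W) = \mathrm{E}[X \mid W], \\
\text{Moment reconstruction:}\quad
  & X_{\mathrm{MR}}(W, Y) = \mathrm{E}[W \mid Y]\,\{1 - G(Y)\} + W\,G(Y), \\
  & G(Y) = \sqrt{\frac{\operatorname{Var}(X \mid Y)}{\operatorname{Var}(W \mid Y)}}
         = \sqrt{\frac{\operatorname{Var}(W \mid Y) - \sigma_U^2}{\operatorname{Var}(W \mid Y)}}.
\end{align*}
```

Under these assumptions, E[X_MR | Y] = E[W | Y] = E[X | Y] and Var(X_MR | Y) = G(Y)^2 Var(W | Y) = Var(X | Y), which matches the abstract's statement that the adjusted values share the first two moments, and hence the covariance with the outcome, of the unobserved true covariate.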

5.
We study a linear mixed effects model for longitudinal data, where the response variable and covariates with fixed effects are subject to measurement error. We propose a method-of-moments estimation procedure that does not require any assumption on the functional forms of the distributions of random effects and other random errors in the model. For a classical measurement error model we apply the instrumental variable approach to ensure identifiability of the parameters. Our methodology, without instrumental variables, can be applied to Berkson measurement errors. Using simulation studies, we investigate the finite sample performances of the estimators and show the impact of measurement error in the covariates and the response on the estimation procedure. The results show that our method performs quite satisfactorily, especially for the fixed effects with measurement error (even under misspecification of the measurement error model). This method is applied to a real data example of a large birth and child cohort study.

6.
X Liu, K Y Liang. Biometrics, 1992, 48(2): 645-654
Ignoring measurement error may cause bias in the estimation of regression parameters. When the true covariates are unobservable, multiple imprecise measurements can be used in the analysis to correct for the associated bias. We suggest a simple estimating procedure that gives consistent estimates of regression parameters by using the repeated measurements with error. The relative Pitman efficiency of our estimator based on models with and without measurement error has been found to be a simple function of the number of replicates and the ratio of intra- to inter-variance of the true covariate. The procedure thus provides a guide for deciding the number of repeated measurements in the design stage. An example from a survey study is presented.
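To make the idea of correcting bias with replicate measurements concrete, here is a minimal Python sketch. It is not the estimating procedure from the article, whose details the abstract does not give. It assumes a simple linear model with classical additive error and uses a textbook method-of-moments reliability correction; the function names (`simulate`, `ols_slope`, `corrected_slope`) and all parameter values are hypothetical.

```python
import random
from statistics import fmean, variance


def simulate(n, k, beta0, beta1, sigma_x, sigma_u, sigma_e, seed=0):
    """Simulate Y = beta0 + beta1*X + e with k replicates W_ij = X_i + U_ij."""
    rng = random.Random(seed)
    reps, y = [], []
    for _ in range(n):
        x = rng.gauss(0.0, sigma_x)
        reps.append([x + rng.gauss(0.0, sigma_u) for _ in range(k)])
        y.append(beta0 + beta1 * x + rng.gauss(0.0, sigma_e))
    return reps, y


def ols_slope(w, y):
    """Least-squares slope of y on a single regressor w."""
    w_bar, y_bar = fmean(w), fmean(y)
    sxy = sum((wi - w_bar) * (yi - y_bar) for wi, yi in zip(w, y))
    sxx = sum((wi - w_bar) ** 2 for wi in w)
    return sxy / sxx


def corrected_slope(reps, y):
    """Return (naive, corrected) slopes based on the replicate means."""
    k = len(reps[0])  # assumes every subject has the same k >= 2 replicates
    w_mean = [fmean(r) for r in reps]
    # The pooled within-subject variance estimates the error variance sigma_u^2.
    sigma_u2 = fmean(variance(r) for r in reps)
    # Var(mean W) estimates sigma_x^2 + sigma_u^2 / k, which gives the
    # reliability ratio by which the naive slope is attenuated.
    var_w = variance(w_mean)
    reliability = (var_w - sigma_u2 / k) / var_w
    naive = ols_slope(w_mean, y)
    return naive, naive / reliability


if __name__ == "__main__":
    reps, y = simulate(5000, 2, 1.0, 2.0, 1.0, 1.0, 1.0, seed=1)
    print(corrected_slope(reps, y))
```

In this setup the attenuation factor is sigma_x^2 / (sigma_x^2 + sigma_u^2 / k), so adding replicates shrinks the bias of the naive slope, which is in the spirit of the abstract's remark about choosing the number of repeated measurements at the design stage.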

7.
Song X, Wang CY. Biometrics, 2008, 64(2): 557-566
Summary. We study joint modeling of survival and longitudinal data. There are two regression models of interest. The primary model is for survival outcomes, which are assumed to follow a time-varying coefficient proportional hazards model. The second model is for longitudinal data, which are assumed to follow a random effects model. Based on the trajectory of a subject's longitudinal data, some covariates in the survival model are functions of the unobserved random effects. Estimated random effects are generally different from the unobserved random effects and hence this leads to covariate measurement error. To deal with covariate measurement error, we propose a local corrected score estimator and a local conditional score estimator. Both approaches are semiparametric methods in the sense that there is no distributional assumption needed for the underlying true covariates. The estimators are shown to be consistent and asymptotically normal. However, simulation studies indicate that the conditional score estimator outperforms the corrected score estimator for finite samples, especially in the case of relatively large measurement error. The approaches are demonstrated by an application to data from an HIV clinical trial.

8.
We study bias-reduced estimators of exponentially transformed parameters in generalized linear models (GLMs) and show how they can be used to obtain bias-reduced conditional (or unconditional) odds ratios in matched case-control studies. Two options are considered and compared: the explicit approach and the implicit approach. The implicit approach is based on the modified score function where bias-reduced estimates are obtained by using iterative procedures to solve the modified score equations. The explicit approach is shown to be a one-step approximation of this iterative procedure. To apply these approaches for the conditional analysis of matched case-control studies, with potentially unmatched confounding and with several exposures, we utilize the relation between the conditional likelihood and the likelihood of the unconditional logit binomial GLM for matched pairs and Cox partial likelihood for matched sets with appropriately set-up data. The properties of the estimators are evaluated by using a large Monte Carlo simulation study and an illustration of a real dataset is shown. Researchers reporting the results on the exponentiated scale should use bias-reduced estimators since otherwise the effects can be under- or overestimated, where the magnitude of the bias is especially large in studies with smaller sample sizes.

9.
Guolo A. Biometrics, 2008, 64(4): 1207-1214
Summary. We investigate the use of prospective likelihood methods to analyze retrospective case-control data where some of the covariates are measured with error. We show that prospective methods can be applied and the case-control sampling scheme can be ignored if one adequately models the distribution of the error-prone covariates in the case-control sampling scheme. Indeed, subject to this, the prospective likelihood methods result in consistent estimates and information standard errors are asymptotically correct. However, the distribution of such covariates is not the same in the population and under case-control sampling, dictating the need to model the distribution flexibly. In this article, we illustrate the general principle by modeling the distribution of the continuous error-prone covariates using the skew-normal distribution. The performance of the method is evaluated through simulation studies, which show satisfactory results in terms of bias and coverage. Finally, the method is applied to the analysis of two data sets which refer, respectively, to a cholesterol study and a study on breast cancer.

10.
Menggang Yu, Bin Nan. Biometrics, 2010, 66(2): 405-414
Summary. In large cohort studies, it often happens that some covariates are expensive to measure and hence only measured on a validation set. On the other hand, relatively cheap but error‐prone measurements of the covariates are available for all subjects. The regression calibration (RC) estimation method (Prentice, 1982, Biometrika 69, 331–342) is a popular method for analyzing such data and has been applied to the Cox model by Wang et al. (1997, Biometrics 53, 131–145) under normal measurement error and rare disease assumptions. In this article, we consider the RC estimation method for the semiparametric accelerated failure time model with covariates subject to measurement error. Asymptotic properties of the proposed method are investigated under a two‐phase sampling scheme for validation data that are selected via stratified random sampling, resulting in neither independent nor identically distributed observations. We show that the estimates converge to some well‐defined parameters. In particular, unbiased estimation is feasible under additive normal measurement error models for normal covariates and under Berkson error models. The proposed method performs well in finite‐sample simulation studies. We also apply the proposed method to a depression mortality study.

11.
Ko H, Davidian M. Biometrics, 2000, 56(2): 368-375
The nonlinear mixed effects model is used to represent data in pharmacokinetics, viral dynamics, and other areas where an objective is to elucidate associations among individual-specific model parameters and covariates; however, covariates may be measured with error. For additive measurement error, we show that substitution of mismeasured covariates for true covariates may lead to biased estimators for fixed effects and random effects covariance parameters, while regression calibration may eliminate bias in fixed effects but fail to correct that in covariance parameters. We develop methods to take account of measurement error that correct this bias and may be implemented with standard software, and we demonstrate their utility via simulation and application to data from a study of HIV dynamics.

12.
Liang Li, Bo Hu, Tom Greene. Biometrics, 2009, 65(3): 737-745
Summary. In many longitudinal clinical studies, the level and progression rate of repeatedly measured biomarkers on each subject quantify the severity of the disease and that subject's susceptibility to progression of the disease. It is of scientific and clinical interest to relate such quantities to a later time-to-event clinical endpoint such as patient survival. This is usually done with a shared parameter model. In such models, the longitudinal biomarker data and the survival outcome of each subject are assumed to be conditionally independent given subject-level severity or susceptibility (also called frailty in statistical terms). In this article, we study the case where the conditional distribution of longitudinal data is modeled by a linear mixed-effect model, and the conditional distribution of the survival data is given by a Cox proportional hazard model. We allow unknown regression coefficients and time-dependent covariates in both models. The proposed estimators are maximizers of an exact correction to the joint log likelihood with the frailties eliminated as nuisance parameters, an idea that originated from correction of covariate measurement error in measurement error models. The corrected joint log likelihood is shown to be asymptotically concave and leads to consistent and asymptotically normal estimators. Unlike most published methods for joint modeling, the proposed estimation procedure does not rely on distributional assumptions of the frailties. The proposed method was studied in simulations and applied to a data set from the Hemodialysis Study.

13.
Satten GA, Carroll RJ. Biometrics, 2000, 56(2): 384-388
We consider methods for analyzing categorical regression models when some covariates (Z) are completely observed but other covariates (X) are missing for some subjects. When data on X are missing at random (i.e., when the probability that X is observed does not depend on the value of X itself), we present a likelihood approach for the observed data that allows the same nuisance parameters to be eliminated in a conditional analysis as when data are complete. An example of a matched case-control study is used to demonstrate our approach.

14.
Andreas Lindén, Jonas Knape. Oikos, 2009, 118(5): 675-680
Within the paradigm of population dynamics a central task is to identify environmental factors affecting population change and to estimate the strength of these effects. We here investigate the impact of observation errors in measurements of population densities on estimates of environmental effects. Adding observation errors may change the autocorrelation of a population time series with potential consequences for estimates of effects of autocorrelated environmental covariates. Using Monte Carlo simulations, we compare the performance of maximum likelihood estimates from three stochastic versions of the Gompertz model (log–linear first order autoregressive model), assuming 1) process error only, 2) observation error only, and 3) both process and observation error (the linear state–space model on log‐scale). We also simulated population dynamics using the Ricker model, and evaluated the corresponding maximum likelihood estimates for process error models. When there is observation error in the data and the considered environmental variable is strongly autocorrelated, its estimated effect is likely to be biased when using process error models. The environmental effect is overestimated when the sign of the autocorrelations of the intrinsic dynamics and the environment are the same and underestimated when the signs differ. With non‐autocorrelated environmental covariates, process error models produce fairly exact point estimates as well as reliable confidence intervals for environmental effects. In all scenarios, observation error models produce unbiased estimates with reasonable precision, but confidence intervals derived from the likelihood profiles are far too optimistic if there is process error present. The safest approach is to use state–space models in the presence of observation error.
These factors are worth considering when interpreting earlier empirical results on population time series, and in future studies, we recommend choosing carefully the modelling approach with respect to intrinsic population dynamics and covariate autocorrelation.

15.
The supplemented case-control design consists of a case-control sample and of an additional sample of disease-free subjects who arise from a given stratum of one of the measured exposures in the case-control study. The supplemental data might, for example, arise from a population survey conducted independently of the case-control study. This design improves precision of estimates of main effects and especially of joint exposures, particularly when joint exposures are uncommon and the prevalence of one of the exposures is low. We first present a pseudo-likelihood estimator (PLE) that is easy to compute. We further adapt two-phase design methods to find maximum likelihood estimates (MLEs) for the log odds ratios for this design and derive asymptotic variance estimators that appropriately account for the differences in sampling schemes of this design from that of the traditional two-phase design. As an illustration of our design we present a study that was conducted to assess the influence of joint exposure to hepatitis-B virus (HBV) and hepatitis-C virus (HCV) infection on the risk of hepatocellular carcinoma in data from Qidong County, Jiangsu Province, China.

16.
It is well known that ignoring measurement error may result in substantially biased estimates in many contexts including linear and nonlinear regressions. For survival data with measurement error in covariates, there has been extensive discussion in the literature with the focus on proportional hazards (PH) models. Recently, research interest has extended to accelerated failure time (AFT) and additive hazards (AH) models. However, the impact of measurement error on other models, such as the proportional odds model, has received relatively little attention, although these models are important alternatives when PH, AFT, or AH models are not appropriate to fit the data. In this paper, we investigate this important problem and study the bias induced by the naive approach of ignoring covariate measurement error. To adjust for the induced bias, we describe the simulation‐extrapolation method. The proposed method enjoys a number of appealing features. Its implementation is straightforward and can be accomplished with minor modifications of existing software. More importantly, the proposed method does not require modeling the covariate process, which is quite attractive in practice. As the precise values of error‐prone covariates are often not observable, any modeling assumption on such covariates has the risk of model misspecification, hence yielding invalid inferences if this happens. The proposed method is carefully assessed both theoretically and empirically. Theoretically, we establish the asymptotic normality for resulting estimators. Numerically, simulation studies are carried out to evaluate the performance of the estimators as well as the impact of ignoring measurement error, along with an application to a data set arising from the Busselton Health Study. Sensitivity of the proposed method to misspecification of the error model is studied as well.
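The simulation-extrapolation (SIMEX) idea named in this abstract can be illustrated on a much simpler model than the proportional odds model the article studies. The Python sketch below is an illustration only, under assumptions not taken from the abstract: a linear regression, classical additive error with a known standard deviation `sigma_u`, a coarse grid of three noise levels, and an exact quadratic extrapolant. The function names and all numeric settings are hypothetical.

```python
import random
from statistics import fmean


def ols_slope(w, y):
    """Least-squares slope of y on a single regressor w."""
    w_bar, y_bar = fmean(w), fmean(y)
    sxy = sum((wi - w_bar) * (yi - y_bar) for wi, yi in zip(w, y))
    sxx = sum((wi - w_bar) ** 2 for wi in w)
    return sxy / sxx


def simex_slope(w, y, sigma_u, n_sim=50, seed=0):
    """Return (naive, simex) slope estimates for a known error SD sigma_u."""
    rng = random.Random(seed)

    def mean_slope(zeta):
        # Simulation step: add extra noise so that the total error variance
        # becomes (1 + zeta) * sigma_u^2, then refit the naive estimator.
        if zeta == 0:
            return ols_slope(w, y)
        sd = (zeta ** 0.5) * sigma_u
        slopes = []
        for _ in range(n_sim):
            w_star = [wi + rng.gauss(0.0, sd) for wi in w]
            slopes.append(ols_slope(w_star, y))
        return fmean(slopes)

    f0, f1, f2 = mean_slope(0), mean_slope(1), mean_slope(2)
    # Extrapolation step: the quadratic through zeta = 0, 1, 2 evaluated at
    # zeta = -1, the noise level that corresponds to no measurement error.
    return f0, 3 * f0 - 3 * f1 + f2


if __name__ == "__main__":
    rng = random.Random(42)
    x = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    w = [xi + rng.gauss(0.0, 0.7) for xi in x]        # error-prone covariate
    y = [2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]  # true slope is 2
    print(simex_slope(w, y, sigma_u=0.7))
```

A quadratic extrapolant usually removes only part of the attenuation, so the SIMEX estimate in this sketch should land between the naive slope and the true slope rather than exactly on the latter. Note that only the error standard deviation is supplied and no distribution for the true covariate is modeled, which is the feature the abstract highlights.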

17.
Spatial models for disease mapping should ideally account for covariates measured both at individual and area levels. The newly available “indiCAR” model fits the popular conditional autoregressive (CAR) model by accommodating both individual and group level covariates while adjusting for spatial correlation in the disease rates. This algorithm has been shown to be effective but assumes log‐linear associations between individual level covariates and outcome. In many studies, the relationship between individual level covariates and the outcome may be non‐log‐linear, and methods to track such nonlinearity between individual level covariate and outcome in spatial regression modeling are not well developed. In this paper, we propose a new algorithm, smooth‐indiCAR, to fit an extension to the popular conditional autoregressive model that can accommodate both linear and nonlinear individual level covariate effects while adjusting for group level covariates and spatial correlation in the disease rates. In this formulation, the effect of a continuous individual level covariate is accommodated via penalized splines. We describe a two‐step estimation procedure to obtain reliable estimates of individual and group level covariate effects where both individual and group level covariate effects are estimated separately. This distributed computing framework enhances its application in the Big Data domain with a large number of individual/group level covariates. We evaluate the performance of smooth‐indiCAR through simulation. Our results indicate that the smooth‐indiCAR method provides reliable estimates of all regression and random effect parameters. We illustrate our proposed methodology with an analysis of data on neutropenia admissions in New South Wales (NSW), Australia.

18.
Longitudinal data often contain missing observations and error-prone covariates. Extensive attention has been directed to analysis methods to adjust for the bias induced by missing observations. There is relatively little work on investigating the effects of covariate measurement error on estimation of the response parameters, especially on simultaneously accounting for the biases induced by both missing values and mismeasured covariates. It is not clear what the impact of ignoring measurement error is when analyzing longitudinal data with both missing observations and error-prone covariates. In this article, we study the effects of covariate measurement error on estimation of the response parameters for longitudinal studies. We develop an inference method that adjusts for the biases induced by measurement error as well as by missingness. The proposed method does not require the full specification of the distribution of the response vector but only requires modeling its mean and variance structures. Furthermore, the proposed method employs the so-called functional modeling strategy to handle the covariate process, with the distribution of covariates left unspecified. These features, plus the simplicity of implementation, make the proposed method very attractive. In this paper, we establish the asymptotic properties for the resulting estimators. With the proposed method, we conduct sensitivity analyses on a cohort data set arising from the Framingham Heart Study. Simulation studies are carried out to evaluate the impact of ignoring covariate measurement error and to assess the performance of the proposed method.

19.
Nested case-control sampling is designed to reduce the costs of large cohort studies. It is important to estimate the parameters of interest as efficiently as possible. We present a new maximum likelihood estimator (MLE) for nested case-control sampling in the context of Cox's proportional hazards model. The MLE is computed by the EM-algorithm, which is easy to implement in the proportional hazards setting. Standard errors are estimated by a numerical profile likelihood approach based on EM-aided differentiation. The work was motivated by a nested case-control study that hypothesized that insulin-like growth factor I was associated with ischemic heart disease. The study was based on a population of 3784 Danes and 231 cases of ischemic heart disease where controls were matched on age and gender. We illustrate the use of the MLE for these data and show how the maximum likelihood framework can be used to obtain information additional to the relative risk estimates of covariates.

20.
It is widely believed that risks of many complex diseases are determined by genetic susceptibilities, environmental exposures, and their interaction. Chatterjee and Carroll (2005, Biometrika 92, 399-418) developed an efficient retrospective maximum-likelihood method for analysis of case-control studies that exploits an assumption of gene-environment independence and leaves the distribution of the environmental covariates to be completely nonparametric. Spinka, Carroll, and Chatterjee (2005, Genetic Epidemiology 29, 108-127) extended this approach to studies where certain types of genetic information, such as haplotype phases, may be missing on some subjects. We further extend this approach to situations when some of the environmental exposures are measured with error. Using a polychotomous logistic regression model, we allow disease status to have K + 1 levels. We propose use of a pseudolikelihood and a related EM algorithm for parameter estimation. We prove consistency and derive the resulting asymptotic covariance matrix of parameter estimates when the variance of the measurement error is known and when it is estimated using replications. Inferences with measurement error corrections are complicated by the fact that the Wald test often behaves poorly in the presence of large amounts of measurement error. The likelihood-ratio (LR) techniques are known to be a good alternative. However, the LR tests are not technically correct in this setting because the likelihood function is based on an incorrect model, i.e., a prospective model in a retrospective sampling scheme. We corrected standard asymptotic results to account for the fact that the LR test is based on a likelihood-type function. The performance of the proposed method is illustrated using simulation studies emphasizing the case when genetic information is in the form of haplotypes and missing data arises from haplotype-phase ambiguity.
An application of our method is illustrated using a population-based case-control study of the association between calcium intake and the risk of colorectal adenoma.
