Similar literature
 20 similar articles found (search time: 15 ms)
1.
Liu D  Lin X  Ghosh D 《Biometrics》2007,63(4):1079-1088
We consider a semiparametric regression model that relates a normal outcome to covariates and a genetic pathway, where the covariate effects are modeled parametrically and the pathway effect of multiple gene expressions is modeled parametrically or nonparametrically using least-squares kernel machines (LSKMs). This unified framework allows a flexible function for the joint effect of multiple genes within a pathway by specifying a kernel function and allows for the possibility that each gene expression effect might be nonlinear and the genes within the same pathway are likely to interact with each other in a complicated way. This semiparametric model also makes it possible to test for the overall genetic pathway effect. We show that the LSKM semiparametric regression can be formulated using a linear mixed model. Estimation and inference hence can proceed within the linear mixed model framework using standard mixed model software. Both the regression coefficients of the covariate effects and the LSKM estimator of the genetic pathway effect can be obtained using the best linear unbiased predictor in the corresponding linear mixed model formulation. The smoothing parameter and the kernel parameter can be estimated as variance components using restricted maximum likelihood. A score test is developed to test for the genetic pathway effect. Model/variable selection within the LSKM framework is discussed. The methods are illustrated using a prostate cancer data set and evaluated using simulations.
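
As a rough illustration of the kernel machine fit described in the abstract above, the sketch below computes the fitted pathway effect h = K (K + lam I)^{-1} y for a Gaussian kernel K(z, z') = exp(-||z - z'||^2 / rho). It is a simplification under stated assumptions: there are no parametric covariates, and rho and lam are fixed by hand rather than estimated by restricted maximum likelihood as the abstract describes. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import math


def gaussian_kernel(z1, z2, rho):
    """Gaussian kernel exp(-||z1 - z2||^2 / rho) between two expression vectors."""
    d2 = sum((a - b) ** 2 for a, b in zip(z1, z2))
    return math.exp(-d2 / rho)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def lskm_fit(z, y, rho, lam):
    """Fitted values K (K + lam I)^{-1} y (assumption: no parametric covariates)."""
    n = len(y)
    k = [[gaussian_kernel(z[i], z[j], rho) for j in range(n)] for i in range(n)]
    k_reg = [[k[i][j] + (lam if i == j else 0.0) for j in range(n)] for i in range(n)]
    alpha = solve(k_reg, y)
    return [sum(k[i][j] * alpha[j] for j in range(n)) for i in range(n)]


# Hypothetical toy data: five subjects, one gene expression value each.
z = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y = [0.5, -1.0, 2.0, 0.0, 1.5]
fitted = lskm_fit(z, y, rho=1.0, lam=0.5)
```

A small lam lets the fit follow the outcomes closely, while a large lam shrinks the fitted effect toward zero; in the mixed model formulation this trade-off corresponds to the ratio of the variance components.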

2.
Biomedical researchers are often interested in estimating the effect of an environmental exposure in relation to a chronic disease endpoint. However, the exposure variable of interest may be measured with errors. In a subset of the whole cohort, a surrogate variable is available for the true unobserved exposure variable. The surrogate variable satisfies an additive measurement error model, but it may not have repeated measurements. The subset in which the surrogate variables are available is called a calibration sample. In addition to the surrogate variables that are available among the subjects in the calibration sample, we consider the situation when there is an instrumental variable available for all study subjects. An instrumental variable is correlated with the unobserved true exposure variable, and hence can be useful in the estimation of the regression coefficients. In this paper, we propose a nonparametric method for Cox regression using the observed data from the whole cohort. The nonparametric estimator is the best linear combination of a nonparametric correction estimator from the calibration sample and the difference of the naive estimators from the calibration sample and the whole cohort. The asymptotic distribution is derived, and the finite sample performance of the proposed estimator is examined via intensive simulation studies. The methods are applied to the Nutritional Biomarkers Study of the Women's Health Initiative.

3.
The Cox regression model is a popular model for analyzing the relationship between a covariate vector and a survival endpoint. The standard Cox model assumes a constant covariate effect across the entire covariate domain. However, in many epidemiological and other applications, the covariate of main interest is subject to a threshold effect: a change in the slope at a certain point within the covariate domain. Often, the covariate of interest is subject to some degree of measurement error. In this paper, we study measurement error correction in the case where the threshold is known. Several bias correction methods are examined: two versions of regression calibration (RC1 and RC2, the latter of which is new), two methods based on the induced relative risk under a rare event assumption (RR1 and RR2, the latter of which is new), a maximum pseudo-partial likelihood estimator (MPPLE), and simulation-extrapolation (SIMEX). We develop the theory, present simulations comparing the methods, and illustrate their use on data concerning the relationship between chronic air pollution exposure to particulate matter PM10 and fatal myocardial infarction (Nurses Health Study (NHS)), and on data concerning the effect of a subject's long-term underlying systolic blood pressure level on the risk of cardiovascular disease death (Framingham Heart Study (FHS)). The simulations indicate that the best methods are RR2 and MPPLE.

4.
Ye W  Lin X  Taylor JM 《Biometrics》2008,64(4):1238-1246
SUMMARY: In this article we investigate regression calibration methods to jointly model longitudinal and survival data using a semiparametric longitudinal model and a proportional hazards model. In the longitudinal model, a biomarker is assumed to follow a semiparametric mixed model where covariate effects are modeled parametrically and subject-specific time profiles are modeled nonparametrically using a population smoothing spline and subject-specific random stochastic processes. The Cox model is assumed for survival data by including both the current measure and the rate of change of the underlying longitudinal trajectories as covariates, as motivated by a prostate cancer study application. We develop a two-stage semiparametric regression calibration (RC) method. Two variations of the RC method are considered, risk set regression calibration and a computationally simpler ordinary regression calibration. Simulation results show that the two-stage RC approach performs well in practice and effectively corrects the bias from the naive method. We apply the proposed methods to the analysis of a dataset for evaluating the effects of the longitudinal biomarker PSA on the recurrence of prostate cancer.

5.
We investigate methods for regression analysis when covariates are measured with errors. In a subset of the whole cohort, a surrogate variable is available for the true unobserved exposure variable. The surrogate variable satisfies the classical measurement error model, but it may not have repeated measurements. In addition to the surrogate variables that are available among the subjects in the calibration sample, we assume that there is an instrumental variable (IV) that is available for all study subjects. An IV is correlated with the unobserved true exposure variable and hence can be useful in the estimation of the regression coefficients. We propose a robust best linear estimator that uses all the available data, which is the most efficient among a class of consistent estimators. The proposed estimator is shown to be consistent and asymptotically normal under very weak distributional assumptions. For Poisson or linear regression, the proposed estimator is consistent even if the measurement error from the surrogate or IV is heteroscedastic. Finite-sample performance of the proposed estimator is examined and compared with other estimators via intensive simulation studies. The proposed method and other methods are applied to a bladder cancer case-control study.
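
The role an instrumental variable plays under classical measurement error, as described in the abstract above, can be seen in a small simulation. The sketch below is not the robust best linear estimator proposed in the paper; it uses the textbook IV ratio cov(Z, Y) / cov(Z, W) for simple linear regression, under the assumptions that the surrogate is W = X + U, that the instrument Z is correlated with the true exposure X, and that Z is independent of both the measurement error U and the outcome error. All names, variances and sample sizes are hypothetical choices for illustration.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def simulate(n, beta, seed):
    """Simulate surrogate W, instrument Z and outcome Y for a latent exposure X."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]         # true exposure, unobserved
    w = [xi + rng.gauss(0.0, 1.0) for xi in x]          # surrogate with classical error
    z = [xi + rng.gauss(0.0, 1.0) for xi in x]          # instrument, correlated with X
    y = [beta * xi + rng.gauss(0.0, 1.0) for xi in x]   # linear outcome model
    return w, z, y


def naive_slope(w, y):
    """Least-squares slope of Y on the error-prone surrogate W (attenuated)."""
    return covariance(w, y) / covariance(w, w)


def iv_slope(w, z, y):
    """Textbook instrumental variable estimate of the slope."""
    return covariance(z, y) / covariance(z, w)


w, z, y = simulate(n=20000, beta=2.0, seed=1)
print("naive slope:", naive_slope(w, y))
print("IV slope:   ", iv_slope(w, z, y))
```

With unit variances for the exposure and the measurement error, the naive slope is attenuated by a factor of about one half, while the IV ratio targets the true slope because the measurement error in W is uncorrelated with Z.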

6.
We consider the efficient estimation of a regression parameter in a partially linear additive nonparametric regression model from repeated measures data when the covariates are multivariate. To date, while there is some literature in the scalar covariate case, the problem has not been addressed in the multivariate additive model case. Ours represents a first contribution in this direction. As part of this work, we first describe the behavior of nonparametric estimators for additive models with repeated measures when the underlying model is not additive. These results are critical when one considers variants of the basic additive model. We apply them to the partially linear additive repeated-measures model, deriving an explicit consistent estimator of the parametric component; if the errors are in addition Gaussian, the estimator is semiparametric efficient. We also apply our basic methods to a unique testing problem that arises in genetic epidemiology; in combination with a projection argument we develop an efficient and easily computed testing scheme. Simulations and an empirical example from nutritional epidemiology illustrate our methods.

7.
We consider the statistical modeling and analysis of replicated multi-type point process data with covariates. Such data arise when heterogeneous subjects experience repeated events or failures which may be of several distinct types. The underlying processes are modeled as nonhomogeneous mixed Poisson processes with random (subject) and fixed (covariate) effects. The method of maximum likelihood is used to obtain estimates and standard errors of the failure rate parameters and regression coefficients. Score tests and likelihood ratio statistics are used for covariate selection. A graphical test of goodness of fit of the selected model is based on generalized residuals. Measures for determining the influence of an individual observation on the estimated regression coefficients and on the score test statistic are developed. An application is described to a large ongoing randomized controlled clinical trial for the efficacy of nutritional supplements of selenium for the prevention of two types of skin cancer.

8.
Chen Q  Ibrahim JG 《Biometrics》2006,62(1):177-184
We consider a class of semiparametric models for the covariate distribution and missing data mechanism for missing covariate and/or response data for general classes of regression models including generalized linear models and generalized linear mixed models. Ignorable and nonignorable missing covariate and/or response data are considered. The proposed semiparametric model can be viewed as a sensitivity analysis for model misspecification of the missing covariate distribution and/or missing data mechanism. The semiparametric model consists of a generalized additive model (GAM) for the covariate distribution and/or missing data mechanism. Penalized regression splines are used to express the GAMs as a generalized linear mixed effects model, in which the variance of the corresponding random effects provides an intuitive index for choosing between the semiparametric and parametric model. Maximum likelihood estimates are then obtained via the EM algorithm. Simulations are given to demonstrate the methodology, and a real data set from a melanoma cancer clinical trial is analyzed using the proposed methods.

9.
Model misspecification in proportional hazards regression   (cited 1 time in total: 0 self-citations, 1 citation by others)
The proportional hazards model is frequently used to evaluate the effect of treatment on failure time events in randomised clinical trials. Concomitant variables are usually available and may be considered for use in the primary analyses under the assumption that incorporating them may reduce bias or improve efficiency. In this paper we consider two approaches to including covariate information: regression modelling and stratification. We focus on the setting where covariate effects are nonproportional and we compare the bias, efficiency and coverage properties of these approaches. These results indicate that our intuition based on linear model analysis of covariance is misleading. Covariate adjustment in proportional hazards models has little effect on the variance but may significantly improve the accuracy of the treatment effect estimator.

10.
Zhang D  Lin X  Sowers M 《Biometrics》2000,56(1):31-39
We consider semiparametric regression for periodic longitudinal data. Parametric fixed effects are used to model the covariate effects and a periodic nonparametric smooth function is used to model the time effect. The within-subject correlation is modeled using subject-specific random effects and a random stochastic process with a periodic variance function. We use maximum penalized likelihood to estimate the regression coefficients and the periodic nonparametric time function, whose estimator is shown to be a periodic cubic smoothing spline. We use restricted maximum likelihood to simultaneously estimate the smoothing parameter and the variance components. We show that all model parameters can be easily obtained by fitting a linear mixed model. A common problem in the analysis of longitudinal data is to compare the time profiles of two groups, e.g., between treatment and placebo. We develop a scaled chi-squared test for the equality of two nonparametric time functions. The proposed model and the test are illustrated by analyzing hormone data collected during two consecutive menstrual cycles and their performance is evaluated through simulations.

11.
This paper develops a model for repeated binary regression when a covariate is measured with error. The model allows for estimating the effect of the true value of the covariate on a repeated binary response. The choice of a probit link for the effect of the error-free covariate, coupled with normal measurement error for the error-free covariate, results in a probit model after integrating over the measurement error distribution. We propose a two-stage estimation procedure where, in the first stage, a linear mixed model is used to fit the repeated covariate. In the second stage, a model for the correlated binary responses conditional on the linear mixed model estimates is fit to the repeated binary data using generalized estimating equations. The approach is demonstrated using nutrient safety data from the Diet Intervention of School Age Children (DISC) study.

12.
Gustafson P 《Biometrics》2007,63(1):69-77
Yin and Ibrahim (2005a, Biometrics 61, 208-216) use a Box-Cox transformed hazard model to acknowledge uncertainty about how a linear predictor acts upon the hazard function of a failure-time response. Particularly, additive and proportional hazards models arise for particular values of the transformation parameter. As is often the case, however, this added model flexibility is obtained at the cost of lessened parameter interpretability. Particularly, the interpretation of the coefficients in the linear predictor is intertwined with the value of the transformation parameter. Moreover, some data sets contain very little information about this parameter. To shed light on the situation, we consider average effects based on averaging (over the joint distribution of the explanatory variables and the failure-time response) the partial derivatives of the hazard, or the log-hazard, with respect to the explanatory variables. First, we consider fitting models which do assume a particular form of covariate effects, for example, proportional hazards or additive hazards. In some such circumstances, average effects are seen to be inferential targets which are robust to the effect form being misspecified. Second, we consider average effects as targets of inference when using the transformed hazard model. We show that in addition to being more interpretable inferential targets, average effects can sometimes be estimated more efficiently than the corresponding regression coefficients.

13.
Food records, including 24-hour recalls and diet diaries, are considered to provide generally superior measures of long-term dietary intake relative to questionnaire-based methods. Despite the expense of processing food records, they are increasingly used as the main dietary measurement in nutritional epidemiology, in particular in sub-studies nested within prospective cohorts. Food records are, however, subject to excess reports of zero intake. Measurement error is a serious problem in nutritional epidemiology because of the lack of gold standard measurements and results in biased estimated diet-disease associations. In this paper, a 3-part measurement error model, which we call the never and episodic consumers (NEC) model, is outlined for food records. It allows for both real zeros, due to never consumers, and excess zeros, due to episodic consumers (EC). Repeated measurements are required for some study participants to fit the model. Simulation studies are used to compare the results from using the proposed model to correct for measurement error with the results from 3 alternative approaches: a crude approach using the mean of repeated food record measurements as the exposure, a linear regression calibration (RC) approach, and an EC model which does not allow real zeros. The crude approach results in badly attenuated odds ratio estimates, except in the unlikely situation in which a large number of repeat measurements is available for all participants. Where repeat measurements are available for all participants, the 3 correction methods perform equally well. However, when only a subset of the study population has repeat measurements, the NEC model appears to provide the best method for correcting for measurement error, with the 2 alternative correction methods, in particular the linear RC approach, resulting in greater bias and loss of coverage. The NEC model is extended to include adjustment for measurements from food frequency questionnaires, enabling better estimation of the proportion of never consumers when the number of repeat measurements is small. The methods are applied to 7-day diary measurements of alcohol intake in the EPIC-Norfolk study.

14.
Ryu D  Li E  Mallick BK 《Biometrics》2011,67(2):454-466
We consider nonparametric regression analysis in a generalized linear model (GLM) framework for data with covariates that are the subject-specific random effects of longitudinal measurements. The usual assumption that the effects of the longitudinal covariate processes are linear in the GLM may be unrealistic and if this happens it can cast doubt on the inference of observed covariate effects. Allowing the regression functions to be unknown, we propose to apply Bayesian nonparametric methods including cubic smoothing splines or P-splines for the possible nonlinearity and use an additive model in this complex setting. To improve computational efficiency, we propose the use of data-augmentation schemes. The approach allows flexible covariance structures for the random effects and within-subject measurement errors of the longitudinal processes. The posterior model space is explored through a Markov chain Monte Carlo (MCMC) sampler. The proposed methods are illustrated and compared to other approaches, the "naive" approach and the regression calibration, via simulations and by an application that investigates the relationship between obesity in adulthood and childhood growth curves.

15.
The starting point of the investigation is the time-invariant Wolgograd model applied to a sample of sugar beets. To overcome the weak multicollinearity of the model in its logarithmic form, a ridge-type estimator that uses prior information on the unknown regression coefficients is applied, namely the biased minimax-linear estimator (MILE). To judge the quality of the estimates, the minimax risks of the MILE and of the ordinary least squares estimator (OLSE) are calculated, together with the estimated maximal crop yields.
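
To make the idea of a ridge-type estimator under multicollinearity concrete, the sketch below fits the ordinary ridge estimator (X'X + kI)^{-1} X'y for two nearly collinear predictors without an intercept. It is the plain ridge estimator, not the minimax-linear estimator of the abstract above, which additionally uses prior information on the coefficients. The function name, the ridge constant k and the data are hypothetical.

```python
def ridge_two_predictors(x1, x2, y, k):
    """Ridge estimate (X'X + kI)^{-1} X'y for two predictors, no intercept.

    Setting k = 0 gives the ordinary least squares estimate.
    """
    s11 = sum(a * a for a in x1) + k              # diagonal entries of X'X + kI
    s22 = sum(b * b for b in x2) + k
    s12 = sum(a * b for a, b in zip(x1, x2))      # off-diagonal entry of X'X
    t1 = sum(a * c for a, c in zip(x1, y))        # entries of X'y
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    # Closed-form solution of the 2 x 2 system by Cramer's rule.
    b1 = (s22 * t1 - s12 * t2) / det
    b2 = (s11 * t2 - s12 * t1) / det
    return b1, b2


# Hypothetical data with two almost identical predictors.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1]
y = [2.0, 4.1, 6.2, 7.9, 10.1]

ols = ridge_two_predictors(x1, x2, y, k=0.0)
ridge = ridge_two_predictors(x1, x2, y, k=1.0)
print("OLS coefficients:  ", ols)
print("ridge coefficients:", ridge)
```

Because the two predictors are almost identical, the least squares coefficients are poorly determined; a positive k shrinks the coefficient vector and stabilizes it at the cost of some bias.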

16.
Zucker DM  Spiegelman D 《Biometrics》2004,60(2):324-334
We consider the Cox proportional hazards model with discrete-valued covariates subject to misclassification. We present a simple estimator of the regression parameter vector for this model. The estimator is based on a weighted least squares analysis of weighted-averaged transformed Kaplan-Meier curves for the different possible configurations of the observed covariate vector. Optimal weighting of the transformed Kaplan-Meier curves is described. The method is designed for the case in which the misclassification rates are known or are estimated from an external validation study. A hybrid estimator for situations with an internal validation study is also described. When there is no misclassification, the regression coefficient vector is small in magnitude, and the censoring distribution does not depend on the covariates, our estimator has the same asymptotic covariance matrix as the Cox partial likelihood estimator. We present results of a finite-sample simulation study under Weibull survival in the setting of a single binary covariate with known misclassification rates. In this simulation study, our estimator performed as well as or, in a few cases, better than the full Weibull maximum likelihood estimator. We illustrate the method on data from a study of the relationship between trans-unsaturated dietary fat consumption and cardiovascular disease incidence.

17.
Song X  Wang CY 《Biometrics》2008,64(2):557-566
Summary. We study joint modeling of survival and longitudinal data. There are two regression models of interest. The primary model is for survival outcomes, which are assumed to follow a time-varying coefficient proportional hazards model. The second model is for longitudinal data, which are assumed to follow a random effects model. Based on the trajectory of a subject's longitudinal data, some covariates in the survival model are functions of the unobserved random effects. Estimated random effects are generally different from the unobserved random effects and hence this leads to covariate measurement error. To deal with covariate measurement error, we propose a local corrected score estimator and a local conditional score estimator. Both approaches are semiparametric methods in the sense that there is no distributional assumption needed for the underlying true covariates. The estimators are shown to be consistent and asymptotically normal. However, simulation studies indicate that the conditional score estimator outperforms the corrected score estimator for finite samples, especially in the case of relatively large measurement error. The approaches are demonstrated by an application to data from an HIV clinical trial.

18.
Zhang D  Lin X  Sowers M 《Biometrics》2007,63(2):351-362
The Daily Hormone Study, a substudy of the Study of Women's Health Across the Nation (SWAN) consisting of more than 600 pre- and perimenopausal women, includes a scalar measure of total hip bone mineral density (BMD) together with repeated measures of creatinine-adjusted follicle stimulating hormone (FSH) assayed from daily urine samples collected over one menstrual cycle. It is of scientific interest to investigate the effect of the FSH time profile during a menstrual cycle on total hip BMD, adjusting for age and body mass index. The statistical analysis is challenged by several features of the data: (1) the covariate FSH is measured longitudinally and its effect on the scalar outcome BMD may be complex; (2) due to varying menstrual cycle lengths, subjects have unbalanced longitudinal measures of FSH; and (3) the longitudinal measures of FSH are subject to considerable among- and within-subject variations and measurement errors. We propose a measurement error partial functional linear model, where repeated measures of FSH are modeled using a functional mixed effects model and the effect of the FSH time profile on BMD is modeled using a partial functional linear model by treating the unobserved true subject-specific FSH time profile as a functional covariate. We develop a two-stage nonparametric regression calibration method using period smoothing splines. Using the connection between smoothing splines and mixed models, we show that a key feature of our approach is that estimation at both stages can be conveniently cast into a unified mixed model framework. A simple testing procedure for constant functional covariate effect is also proposed. The proposed methods are evaluated using simulation studies and applied to the SWAN data.

19.
Li L  Shao J  Palta M 《Biometrics》2005,61(3):824-830
Covariate measurement error in regression is typically assumed to act in an additive or multiplicative manner on the true covariate value. However, such an assumption does not hold for the measurement error of sleep-disordered breathing (SDB) in the Wisconsin Sleep Cohort Study (WSCS). The true covariate is the severity of SDB, and the observed surrogate is the number of breathing pauses per unit time of sleep, which has a nonnegative semicontinuous distribution with a point mass at zero. We propose a latent variable measurement error model for the error structure in this situation and implement it in a linear mixed model. The estimation procedure is similar to regression calibration but involves a distributional assumption for the latent variable. Modeling and model-fitting strategies are explored and illustrated through an example from the WSCS.

20.
Huang JZ  Liu L 《Biometrics》2006,62(3):793-802
The Cox proportional hazards model usually assumes an exponential form for the dependence of the hazard function on covariate variables. However, in practice this assumption may be violated and other relative risk forms may be more appropriate. In this article, we consider the proportional hazards model with an unknown relative risk form. Issues in model interpretation are addressed. We propose a method to estimate the relative risk form and the regression parameters simultaneously by first approximating the logarithm of the relative risk form by a spline, and then employing the maximum partial likelihood estimation. An iterative alternating optimization procedure is developed for efficient implementation. Statistical inference of the regression coefficients and of the relative risk form based on parametric asymptotic theory is discussed. The proposed methods are illustrated using simulation and an application to the Veteran's Administration lung cancer data.
