Longitudinal studies are often used in biomedical research and clinical trials to evaluate treatment effects. The within-subject association pattern must be considered in both the sample size calculation and the analysis. One of the most important approaches to analyzing such a study is the generalized estimating equation (GEE) method proposed by Liang and Zeger, in which a "working correlation structure" is introduced and the within-subject association pattern depends on a vector of association parameters denoted by ρ. Explicit sample size formulas for two-group comparisons in linear and logistic regression models were obtained by Liu and Liang based on the GEE method. For cluster randomized trials (CRTs), researchers have proposed optimal sample sizes at both the cluster and individual level as a function of sampling costs and the intracluster correlation coefficient (ICC). In these approaches, the optimal sample sizes depend strongly on the ICC. However, the ICC is usually unknown for CRTs and multicenter trials. To overcome this shortcoming, Van Breukelen et al. consider a range of possible ICC values identified from literature reviews and present Maximin designs (MMDs) based on relative efficiency (RE) and efficiency under budget and cost constraints. In this paper, the optimal sample size and number of repeated measurements using GEE models with an exchangeable working correlation matrix are proposed under a fixed budget, where "optimal" refers to maximum power for a given sampling budget. The equations for the sample size and number of repeated measurements for a known parameter value ρ are derived, and a straightforward algorithm for unknown ρ is developed. Applications in practice are discussed. We also discuss the existence of the optimal design when an AR(1) working correlation matrix is assumed. Our proposed method can be extended to scenarios in which the true and working correlation matrices differ.
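
The budget-constrained trade-off between the number of subjects and the number of repeated measurements can be sketched under simplifying assumptions that are not taken from the paper: the estimand is a time-averaged mean, the true correlation is exchangeable with parameter rho, and the variance of the estimate is proportional to (1 + (m - 1) rho) / (n m) for n subjects with m measurements each. The cost model (a fixed cost per subject plus a cost per measurement) and all function names are hypothetical.

```python
import math

def variance_factor(n, m, rho):
    """Relative variance of a time-averaged mean under exchangeable correlation."""
    return (1.0 + (m - 1) * rho) / (n * m)

def best_design(budget, c_subject, c_measure, rho, max_m=50):
    """Grid-search m, spending the budget on as many subjects as it allows.

    Hypothetical cost model: each subject costs c_subject + c_measure * m.
    Returns (n, m, variance_factor) with the smallest variance factor.
    """
    best = None
    for m in range(1, max_m + 1):
        n = int(budget // (c_subject + c_measure * m))
        if n < 1:
            break  # even one subject is unaffordable at this m
        v = variance_factor(n, m, rho)
        if best is None or v < best[2]:
            best = (n, m, v)
    return best

def continuous_optimum_m(c_subject, c_measure, rho):
    """Minimiser of the variance factor when n and m are treated as continuous."""
    return math.sqrt(c_subject * (1.0 - rho) / (c_measure * rho))
```

Under these assumptions the continuous optimum for m does not depend on the budget, only on the cost ratio and rho, which is why an unknown rho makes the design problem harder. For example, `best_design(10000, 100, 10, 0.1)` searches integer designs near `continuous_optimum_m(100, 10, 0.1)`, which equals the square root of 90.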

Wang L  Zhou J  Qu A 《Biometrics》2012,68(2):353-360
We consider the penalized generalized estimating equations (GEEs) for analyzing longitudinal data with high-dimensional covariates, which often arise in microarray experiments and large-scale health studies. Existing high-dimensional regression procedures often assume independent data and rely on the likelihood function. Construction of a feasible joint likelihood function for high-dimensional longitudinal data is challenging, particularly for correlated discrete outcome data. The penalized GEE procedure only requires specifying the first two marginal moments and a working correlation structure. We establish the asymptotic theory in a high-dimensional framework where the number of covariates p(n) increases as the number of clusters n increases, and p(n) can reach the same order as n. One important feature of the new procedure is that the consistency of model selection holds even if the working correlation structure is misspecified. We evaluate the performance of the proposed method using Monte Carlo simulations and demonstrate its application using a yeast cell-cycle gene expression data set.

The generalized estimating equation (GEE) approach is widely adopted for regression modeling of longitudinal data, as it takes account of potential correlations within the same subject. Although the standard GEE assumes common regression coefficients among all subjects, such an assumption may not be realistic when there is potential heterogeneity in regression coefficients among subjects. In this paper, we develop a flexible and interpretable approach, called grouped GEE analysis, for modeling longitudinal data that allows heterogeneity in regression coefficients. The proposed method assumes that the subjects are divided into a finite number of groups and that subjects within the same group share the same regression coefficients. We provide a simple algorithm for grouping subjects and estimating the regression coefficients simultaneously, and show the asymptotic properties of the proposed estimator. The number of groups can be determined by the cross-validation with averaging method. We demonstrate the proposed method through simulation studies and an application to a real data set.

We propose to analyze panel count data using a spline-based semiparametric projected generalized estimating equation (GEE) method with the proportional mean model $E\{N(t) \mid Z\} = \Lambda_0(t)\exp(\beta_0^{T} Z)$. The natural logarithm of the baseline mean function, $\log \Lambda_0(t)$, is approximated by a monotone cubic B-spline function. The estimates of regression parameters and spline coefficients are obtained by projecting the GEE estimates into the feasible domain using a weighted isotonic regression (IR). The proposed method avoids assuming any parametric structure of the baseline mean function or any stochastic model for the underlying counting process. Selection of the working covariance matrix that accounts for overdispersion improves the estimation efficiency and leads to less biased variance estimates. Simulation studies are conducted using different working covariance matrices in the GEE to investigate finite sample performance of the proposed method, to compare the estimation efficiency, and to explore the performance of different variance estimates in the presence of overdispersion. Finally, the proposed method is applied to a real data set from a bladder tumor clinical trial.
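
The abstract above does not say how the weighted isotonic regression is computed. A standard way to obtain the weighted least-squares projection of a sequence onto nondecreasing sequences is the pool-adjacent-violators algorithm (PAVA). The sketch below assumes that the monotonicity constraint amounts to requiring nondecreasing coefficients; the function name is hypothetical, and this is not presented as the authors' implementation.

```python
def weighted_isotonic(y, w):
    """Weighted least-squares projection of y onto nondecreasing sequences (PAVA).

    y: unconstrained estimates; w: positive weights of the same length.
    """
    blocks = []  # each block holds [weighted mean, total weight, number of points]
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Merge backwards while adjacent block means violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)  # every point in a block gets the pooled mean
    return fitted
```

Violating neighbours are replaced by their weighted average, so estimates that are already monotone pass through unchanged and only the offending stretch is flattened.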

A common and important problem in clustered sampling designs is that the effect of within-cluster exposures (i.e., exposures that vary within clusters) on outcome may be confounded by both measured and unmeasured cluster-level factors (i.e., measurements that do not vary within clusters). When some of these are poorly accounted for or not accounted for at all, estimation of this effect through population-averaged models or random-effects models may introduce bias. We accommodate this by developing a general theory for the analysis of clustered data, which enables consistent and asymptotically normal estimation of the effects of within-cluster exposures in the presence of cluster-level confounders. Semiparametric efficient estimators are obtained by solving so-called conditional generalized estimating equations. We compare this approach with a popular proposal by Neuhaus and Kalbfleisch (1998, Biometrics 54, 638–645), who separate the exposure effect into a within- and a between-cluster component within a random intercept model. We find that the latter approach yields consistent and efficient estimators when the model is linear, but is less flexible in terms of model specification. Under nonlinear models, this approach may yield inconsistent and inefficient estimators, though with little bias in most practical settings.

We propose criteria for variable selection in the mean model and for the selection of a working correlation structure in longitudinal data with dropout missingness using weighted generalized estimating equations. The proposed criteria are based on a weighted quasi‐likelihood function and a penalty term. Our simulation results show that the proposed criteria frequently select the correct model in candidate mean models. The proposed criteria also have good performance in selecting the working correlation structure for binary and normal outcomes. We illustrate our approaches using two empirical examples. In the first example, we use data from a randomized double‐blind study to test the cancer‐preventing effects of beta carotene. In the second example, we use longitudinal CD4 count data from a randomized double‐blind study.

Longitudinal data often encounter missingness with monotone and/or intermittent missing patterns. Multiple imputation (MI) has been popularly employed for analysis of missing longitudinal data. In particular, the MI‐GEE method has been proposed for inference of generalized estimating equations (GEE) when missing data are imputed via MI. However, little is known about how to perform model selection with multiply imputed longitudinal data. In this work, we extend the existing GEE model selection criteria, including the “quasi‐likelihood under the independence model criterion” (QIC) and the “missing longitudinal information criterion” (MLIC), to accommodate multiply imputed datasets for selection of the MI‐GEE mean model. According to real data analyses from a schizophrenia study and an AIDS study, as well as simulations under nonmonotone missingness with a moderate proportion of missing observations, we conclude that: (i) more than a few imputed datasets are required for stable and reliable model selection in MI‐GEE analysis; (ii) the MI‐based GEE model selection methods with a suitable number of imputations generally perform well, while the naive application of existing model selection methods by simply ignoring missing observations may lead to very poor performance; (iii) the model selection criteria based on improper (frequentist) multiple imputation generally perform better than their analogues based on proper (Bayesian) multiple imputation.

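Objective: To explore the application of generalized estimating equations in studies of CT display methods. Methods: Using the GENMOD procedure in SAS, generalized estimating equations were applied to analyze a worked example from a CT display method study. Results: A SAS program for generalized estimating equations is given, and the parameter estimates and pairwise comparison results are interpreted. Conclusion: Generalized estimating equations can effectively analyze non-independent data with binary or multi-category response variables in studies of CT display methods.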

Wang YG  Lin X  Zhu M 《Biometrics》2005,61(3):684-691
Robust methods are useful in making reliable statistical inferences when there are small deviations from the model assumptions. The widely used method of the generalized estimating equations can be "robustified" by replacing the standardized residuals with the M-residuals. If the Pearson residuals are assumed to be unbiased from zero, parameter estimators from the robust approach are asymptotically biased when error distributions are not symmetric. We propose a distribution-free method for correcting this bias. Our extensive numerical studies show that the proposed method can reduce the bias substantially. Examples are given for illustration.
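
The source of the bias described above can be seen in a small numerical sketch. It assumes Huber's psi function as the M-residual transform (the abstract does not name a specific function) with the conventional tuning constant c = 1.345, and it uses deterministic quantile grids rather than real data. For a centred Exp(1) error, clipping only affects the long right tail, so the mean of the clipped residuals is negative; analytically it equals -exp(-(1 + c)), about -0.096.

```python
import math

def huber_psi(r, c=1.345):
    """Huber's psi: the identity on [-c, c], clipped to +/-c outside."""
    return max(-c, min(c, r))

def mean_psi(residuals, c=1.345):
    """Average of the clipped (M-) residuals."""
    return sum(huber_psi(r, c) for r in residuals) / len(residuals)

N = 10000
# Quantile grid of a centred Exp(1) variable: mean 0, variance 1, right-skewed.
skewed = [-math.log(1.0 - (i + 0.5) / N) - 1.0 for i in range(N)]
# Symmetric comparison sample built from +/- pairs on a regular grid.
symmetric = [r for i in range(N // 2)
             for r in ((i + 0.5) / 1000.0, -(i + 0.5) / 1000.0)]

print("skewed errors:   ", mean_psi(skewed))     # clearly below zero
print("symmetric errors:", mean_psi(symmetric))  # zero, by symmetry
```

An estimating equation built on the clipped residuals is therefore no longer centred at the true parameter when errors are skewed, which is the bias that a correction of the kind proposed in the abstract has to remove.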

Transition models are an important framework that can be used to model longitudinal categorical data. They are particularly useful when the primary interest is in prediction. The available methods for this class of models are suitable for the cases in which responses are recorded individually over time. However, in many areas, it is common for categorical data to be recorded as groups, that is, different categories with a number of individuals in each. As motivation we consider a study of insect movement and another of pig behaviour. The first study was developed to understand the movement patterns of female adults of Diaphorina citri, a pest of citrus plantations. The second study investigated how hogs behaved under the influence of environmental enrichment. In both studies, the number of individuals in different response categories was observed over time. We propose a new framework for considering the time dependence in the linear predictor of a generalized logit transition model using a quantitative response, corresponding to the number of individuals in each category. We use maximum likelihood estimation and present the results of the fitted models under stationarity and non-stationarity assumptions, and use recently proposed tests to assess non-stationarity. We evaluated the performance of the proposed model using simulation studies under different scenarios, and concluded that our modeling framework represents a flexible alternative for analyzing grouped longitudinal categorical data.

It is widely acknowledged that the analysis of comparative data from related species should be performed taking into account their phylogenetic relationships. We introduce a new method, based on the use of generalized estimating equations (GEE), for the analysis of comparative data. The principle is to incorporate, in the modelling process, a correlation matrix that specifies the dependence among observations. This matrix is obtained from the phylogenetic tree of the studied species. Using this approach, a variety of distributions (discrete or continuous) can be analysed using a generalized linear modelling framework, phylogenies with multichotomies can be analysed, and there is no need to estimate ancestral character state. A simulation study showed that the proposed approach has good statistical properties with a type-I error rate close to the nominal 5%, and statistical power to detect correlated evolution between two characters which increases with the strength of the correlation. The proposed approach performs well for the analysis of discrete characters. We illustrate our approach with some data on macro-ecological correlates in birds. Some extensions of the use of GEE are discussed.

FitzGerald PE 《Biometrics》2002,58(4):718-726
In this article, we assess the performance of two standard, but naive, methods for handling incomplete familial data in GEE2 analyses when the outcome is binary. We also propose a new method for analyzing such data using GEE2 when explanatory variables are discrete. Unlike the naive methods, the new method does not require the missing data process to be ignorable. We illustrate our method with an example that examines the familial aggregation of obesity.

Leung Lai T  Shih MC  Wong SP 《Biometrics》2006,62(1):159-167
To circumvent the computational complexity of likelihood inference in generalized mixed models that assume linear or more general additive regression models of covariate effects, Laplace's approximations to multiple integrals in the likelihood have been commonly used without addressing the issue of adequacy of the approximations for individuals with sparse observations. In this article, we propose a hybrid estimation scheme to address this issue. The likelihoods for subjects with sparse observations use Monte Carlo approximations involving importance sampling, while Laplace's approximation is used for the likelihoods of other subjects that satisfy a certain diagnostic check on the adequacy of Laplace's approximation. Because of its computational tractability, the proposed approach allows flexible modeling of covariate effects by using regression splines and model selection procedures for knot and variable selection. Its computational and statistical advantages are illustrated by simulation and by application to longitudinal data from a fecundity study of fruit flies, for which overdispersion is modeled via a double exponential family.

Sutradhar BC  Das K 《Biometrics》2000,56(2):622-625
Liang and Zeger (1986, Biometrika 73, 13-22) introduced a generalized estimating equation (GEE) approach based on a working correlation matrix to obtain efficient estimators of regression parameters in the class of generalized linear models for repeated measures data. As demonstrated by Crowder (1995, Biometrika 82, 407-410), because of uncertainty of the definition of the working correlation matrix, the Liang-Zeger approach may, in some cases, lead to a complete breakdown of the estimation of the regression parameters. After taking this comment of Crowder into account, recently Sutradhar and Das (1999, Biometrika 86, 459-465) examined the loss of efficiency of the regression estimators due to misspecification of the correlation structures. But their study was confined to the regression estimation with cluster-level covariates, as in the original paper of Liang and Zeger. In this paper, we study this efficiency loss problem for the generalized regression models with within-cluster covariates by utilizing the approach of Sutradhar and Das (1999).
