Similar literature
 Found 20 similar documents (search time: 46 ms)
1.
Summary: The generalized estimating equation (GEE) has been a popular tool for marginal regression analysis with longitudinal data, and its extension, the weighted GEE approach, can further accommodate data that are missing at random (MAR). Model selection methodologies for GEE, however, have not been systematically developed to allow for missing data. We propose the missing longitudinal information criterion (MLIC) for selection of the mean model, and the MLIC for correlation (MLICC) for selection of the correlation structure in GEE when the outcome data are subject to dropout/monotone missingness and are MAR. Our simulation results reveal that the MLIC and MLICC are effective for variable selection in the mean model and for selecting the correlation structure, respectively. We also demonstrate the remarkable drawbacks of naively treating incomplete data as if they were complete and applying the existing GEE model selection method. The utility of the proposed method is further illustrated by two real applications involving missing longitudinal outcome data.

2.
Within behavioural research, non‐normally distributed data with a complicated structure are common. For instance, data can represent repeated observations of quantities on the same individual. The regression analysis of such data is complicated both by the interdependency of the observations (response variables) and by their non‐normal distribution. Over the last decade, such data have been more and more frequently analysed using generalized mixed‐effect models. Some researchers invoke the heavy machinery of mixed‐effect modelling to obtain the desired population‐level (marginal) inference, which can be achieved by using simpler tools, namely marginal models. This paper highlights marginal modelling (using generalized estimating equations [GEE]) as an alternative method. In various situations, GEE can be based on fewer assumptions and directly generate estimates (population‐level parameters) which are of immediate interest to the behavioural researcher (such as population means). Using four examples from behavioural research, we demonstrate the use, advantages, and limits of the GEE approach as implemented within the functions of the ‘geepack’ package in R.

3.
Generalized linear models (GLMs) are increasingly being used in daily data analysis. However, model checking for GLMs with correlated discrete response data remains difficult. In this paper, through a case study on marginal logistic regression using a real data set, we illustrate the flexibility and effectiveness of using conditional moment tests (CMTs), along with other graphical methods, to do model checking for generalized estimating equation (GEE) analyses. Although CMTs provide an array of powerful diagnostic tests for model checking, they were originally proposed in the econometrics literature and, to our knowledge, have never been applied to GEE analyses. CMTs cover many existing tests, including the (generalized) score test for an omitted covariate, as special cases. In summary, we believe that CMTs provide a class of useful model checking tools.

4.
Since Liang and Zeger (1986) proposed the ‘generalized estimating equations’ approach for the estimation of regression parameters in models with correlated discrete responses, a lot of work has been devoted to the investigation of the properties of the corresponding GEE estimators. However, the effects of different kinds of covariates have often been overlooked. In this paper it is shown that the use of non-singular block invariant matrices of covariates, such as a design matrix in an analysis of variance model, leads to GEE estimators which are identical regardless of the ‘working’ correlation matrix used. Moreover, they are efficient (McCullagh, 1983). If, on the other hand, only covariates are used which are invariant within blocks, the efficiency gain in choosing the ‘correct’ vs. an ‘incorrect’ correlation structure is shown to be negligible. The results of a simple simulation study suggest that although different GEE estimators are not identical and are not as efficient as a ML estimator, the differences are still negligible if both types of invariant covariates are present.

5.
This paper considers inference methods for case-control logistic regression in longitudinal setups. The motivation is provided by an analysis of plains bison spatial location as a function of habitat heterogeneity. The sampling is done according to a longitudinal matched case-control design in which, at certain time points, exactly one case, the actual location of an animal, is matched to a number of controls, the alternative locations that could have been reached. We develop inference methods for the conditional logistic regression model in this setup, which can be formulated within a generalized estimating equation (GEE) framework. This permits the use of statistical techniques developed for GEE-based inference, such as robust variance estimators and model selection criteria adapted for non-independent data. The performance of the methods is investigated in a simulation study and illustrated with the bison data analysis.

6.
To date, most statistical developments in QTL detection methodology have been directed at continuous traits with an underlying normal distribution. This paper presents a method for QTL analysis of non-normal traits using a generalized linear mixed model approach. Development of this method has been motivated by a backcross experiment involving two inbred lines of mice that was conducted in order to locate a QTL for litter size. A Poisson regression form is used to model litter size, with allowances made for under- as well as over-dispersion, as suggested by the experimental data. In addition to fixed parity effects, random animal effects have also been included in the model. However, the method is not fully parametric as the model is specified only in terms of means, variances and covariances, and not as a full probability model. Consequently, a generalized estimating equations (GEE) approach is used to fit the model. For statistical inferences, permutation tests and bootstrap procedures are used. This method is illustrated with simulated as well as experimental mouse data. Overall, the method is found to be quite reliable, and with modification, can be used for QTL detection for a range of other non-normally distributed traits.

7.
Deletion diagnostics are introduced for the regression analysis of clustered binary outcomes estimated with alternating logistic regressions, an implementation of generalized estimating equations (GEE) that estimates regression coefficients in a marginal mean model and in a model for the intracluster association given by the log odds ratio. The diagnostics are developed within an estimating equations framework that recasts the estimating functions for association parameters based upon conditional residuals into equivalent functions based upon marginal residuals. Extensions of earlier work on GEE diagnostics follow directly, including computational formulae for one‐step deletion diagnostics that measure the influence of a cluster of observations on the estimated regression parameters and on the overall marginal mean or association model fit. The diagnostic formulae are evaluated with simulation studies and with an application concerning an assessment of factors associated with health maintenance visits in primary care medical practices. The application and the simulations demonstrate that the proposed cluster‐deletion diagnostics for alternating logistic regressions are good approximations of their exact fully iterated counterparts.

8.
Researchers interested in the association of a predictor with an outcome will often collect information about that predictor from more than one source. Standard multiple regression methods allow estimation of the effect of each predictor on the outcome while controlling for the remaining predictors. The resulting regression coefficient for each predictor has an interpretation that is conditional on all other predictors. In settings in which interest is in comparison of the marginal pairwise relationships between each predictor and the outcome separately (e.g., studies in psychiatry with multiple informants or comparison of the predictive values of diagnostic tests), standard regression methods are not appropriate. Instead, the generalized estimating equations (GEE) approach can be used to simultaneously estimate, and make comparisons among, the separate pairwise marginal associations. In this paper, we consider maximum likelihood (ML) estimation of these marginal relationships when the outcome is binary. ML enjoys benefits over GEE methods in that it is asymptotically efficient, can accommodate missing data that are ignorable, and allows likelihood-based inferences about the pairwise marginal relationships. We also explore the asymptotic relative efficiency of ML and GEE methods in this setting.

9.
GEE with Gaussian estimation of the correlations when data are incomplete (cited by: 4; self-citations: 0; citations by others: 4)
This paper considers a modification of generalized estimating equations (GEE) for handling missing binary response data. The proposed method uses Gaussian estimation of the correlation parameters, i.e., the estimating function that yields an estimate of the correlation parameters is obtained from the multivariate normal likelihood. The proposed method yields consistent estimates of the regression parameters when data are missing completely at random (MCAR). However, when data are missing at random (MAR), consistency may not hold. In a simulation study with repeated binary outcomes that are missing at random, the magnitude of the potential bias that can arise is examined. The results of the simulation study indicate that, when the working correlation matrix is correctly specified, the bias is almost negligible for the modified GEE. In the simulation study, the proposed modification of GEE is also compared to the standard GEE, multiple imputation, and weighted estimating equations approaches. Finally, the proposed method is illustrated using data from a longitudinal clinical trial comparing two therapeutic treatments, zidovudine (AZT) and didanosine (ddI), in patients with HIV.

10.
Longitudinal data often encounter missingness with monotone and/or intermittent missing patterns. Multiple imputation (MI) has been popularly employed for analysis of missing longitudinal data. In particular, the MI‐GEE method has been proposed for inference of generalized estimating equations (GEE) when missing data are imputed via MI. However, little is known about how to perform model selection with multiply imputed longitudinal data. In this work, we extend the existing GEE model selection criteria, including the “quasi‐likelihood under the independence model criterion” (QIC) and the “missing longitudinal information criterion” (MLIC), to accommodate multiply imputed datasets for selection of the MI‐GEE mean model. According to real data analyses from a schizophrenia study and an AIDS study, as well as simulations under nonmonotone missingness with a moderate proportion of missing observations, we conclude that: (i) more than a few imputed datasets are required for stable and reliable model selection in MI‐GEE analysis; (ii) the MI‐based GEE model selection methods with a suitable number of imputations generally perform well, while the naive application of existing model selection methods by simply ignoring missing observations may lead to very poor performance; (iii) the model selection criteria based on improper (frequentist) multiple imputation generally perform better than their analogues based on proper (Bayesian) multiple imputation.

11.
The generalized estimating equation (GEE) approach is widely adopted for regression modeling of longitudinal data, taking account of potential correlations within the same subjects. Although the standard GEE assumes common regression coefficients among all the subjects, such an assumption may not be realistic when there is potential heterogeneity in regression coefficients among subjects. In this paper, we develop a flexible and interpretable approach, called grouped GEE analysis, to modeling longitudinal data while allowing heterogeneity in regression coefficients. The proposed method assumes that the subjects are divided into a finite number of groups and that subjects within the same group share the same regression coefficient. We provide a simple algorithm for grouping subjects and estimating the regression coefficients simultaneously, and show the asymptotic properties of the proposed estimator. The number of groups can be determined by the cross-validation with averaging method. We demonstrate the proposed method through simulation studies and an application to a real data set.

12.
Akaike's information criterion in generalized estimating equations (cited by: 15; self-citations: 0; citations by others: 15)
Pan W. Biometrics 2001, 57(1):120-125
Correlated response data are common in biomedical studies. Regression analysis based on the generalized estimating equations (GEE) is an increasingly important method for such data. However, there seem to be few model-selection criteria available in GEE. The well-known Akaike Information Criterion (AIC) cannot be directly applied since AIC is based on maximum likelihood estimation while GEE is nonlikelihood based. We propose a modification to AIC, where the likelihood is replaced by the quasi-likelihood and a proper adjustment is made for the penalty term. Its performance is investigated through simulation studies. For illustration, the method is applied to a real data set.
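
For orientation, the criterion described in this abstract is usually quoted in the form below. The notation is an assumption taken from common presentations of the QIC, not from the abstract itself: $\hat\beta(R)$ is the GEE estimate under working correlation $R$, $Q(\cdot\,; I)$ is the quasi-likelihood evaluated under the independence working correlation $I$, $\hat\Omega_I$ is the model-based information (the inverse of the model-based covariance) under independence, and $\hat V_R$ is the robust sandwich covariance estimate of $\hat\beta(R)$.

```latex
\mathrm{QIC}(R) \;=\; -2\, Q\bigl(\hat\beta(R);\, I\bigr) \;+\; 2\, \operatorname{tr}\bigl(\hat\Omega_I\, \hat V_R\bigr)
```

The first term replaces the log-likelihood in AIC with the quasi-likelihood, and the trace term replaces the penalty $2p$. When the working correlation is correctly specified, the trace is approximately $p$, the number of regression parameters.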

13.
Efficiency of regression estimates for clustered data (cited by: 1; self-citations: 0; citations by others: 1)
Mancl LA, Leroux BG. Biometrics 1996, 52(2):500-511
Statistical methods for clustered data, such as generalized estimating equations (GEE) and generalized least squares (GLS), require selecting a correlation or covariance structure to specify the dependence between observations within a cluster. Valid regression estimates can be obtained that do not depend on correct specification of the true correlation, but inappropriate specifications can result in a loss of efficiency. We derive general expressions for the asymptotic relative efficiency of GEE and GLS estimators under nested correlation structures. Efficiency is shown to depend on the covariate distribution, the cluster sizes, the response variable correlation, and the regression parameters. The results demonstrate that efficiency is quite sensitive to the between- and within-cluster variation of the covariates, and provide useful characterizations of models for which upper and lower efficiency bounds are attained. Efficiency losses for simple working correlation matrices, such as independence, can be large even for small to moderate correlations and cluster sizes.

14.
Statistical analysis of diving behavior data collected from satellite-linked dive recorders (SDRs) can be challenging because: (1) the data are binned into several depth and time categories, (2) the data from individual animals are often temporally autocorrelated, (3) random variation between individuals is common, and (4) the number of dives can be correlated among depth bins. Previous analyses often have ignored one or more of these statistical issues. In addition, previous SDR studies have focused on univariate analyses of index variables, rather than multivariate analyses of data from all depth bins. We describe multivariate analysis of SDR data using generalized estimating equations (GEE) and demonstrate the method using SDR data from harbor seals (Phoca vitulina) monitored in Prince William Sound, Alaska between 1992 and 1997. Multivariate regression provides greater opportunities for scientific inference than univariate methods, particularly in terms of depth resolution. In addition, empirical variance estimation makes GEE models somewhat easier to implement than other techniques that explicitly model all of the relevant components of variance. However, valid use of empirical variance estimation requires an adequate sample size of individual animals.
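
The empirical (sandwich) variance estimation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' multivariate analysis: it assumes a single covariate, an identity link, no intercept, and an independence working correlation, in which case the GEE estimate reduces to least squares. The function name `gee_independence_slope` and the toy data are hypothetical.

```python
from math import sqrt


def gee_independence_slope(clusters):
    """Fit y = beta * x by GEE with an independence working correlation.

    `clusters` is a list of clusters (e.g. one per animal); each cluster is a
    list of (x, y) pairs. Returns (beta, robust_se, naive_se).
    Hypothetical helper for illustration only.
    """
    sxx = sum(x * x for c in clusters for x, _ in c)
    sxy = sum(x * y for c in clusters for x, y in c)
    beta = sxy / sxx  # solves sum_i sum_j x_ij * (y_ij - beta * x_ij) = 0

    # "Meat" of the sandwich: squared score contribution of each whole cluster,
    # so within-cluster correlation of residuals is picked up empirically.
    meat = sum(sum(x * (y - beta * x) for x, y in c) ** 2 for c in clusters)
    robust_var = meat / sxx ** 2  # bread * meat * bread, with bread = 1 / sxx

    # Naive model-based variance that treats all observations as independent.
    n = sum(len(c) for c in clusters)
    rss = sum((y - beta * x) ** 2 for c in clusters for x, y in c)
    naive_var = rss / (n - 1) / sxx

    return beta, sqrt(robust_var), sqrt(naive_var)


# Toy data: two clusters of two observations each (made-up numbers).
example = [[(1.0, 2.0), (1.0, 4.0)], [(1.0, 0.0), (1.0, 2.0)]]
beta, robust_se, naive_se = gee_independence_slope(example)
print(beta, robust_se, naive_se)
```

Because the "meat" is a sum over clusters, the estimator is only as reliable as the number of clusters is large, which is consistent with the abstract's caveat about needing an adequate sample of individual animals.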

15.
Robust regression for clustered data with application to binary responses (cited by: 3; self-citations: 0; citations by others: 3)
Preisser JS, Qaqish BF. Biometrics 1999, 55(2):574-579
Generalized estimating equations (GEE) can be highly influenced by the presence of unusual data points. A generalization of the GEE procedure, which yields parameter estimates and fitted values that are resistant to influential data, is introduced. Resistant generalized estimating equations (REGEE) include weights in the estimating equations to downweight influential observations or clusters. Influential observations are downweighted according to their leverage or residual in an example of correlated binary regression applied to 137 urinary incontinent elderly patients from 38 medical practices.

16.
A logistic regression with random effects model is commonly applied to analyze clustered binary data, and every cluster is assumed to have a different proportion of success. However, it could be of interest to obtain the proportion of success over clusters (i.e. the marginal proportion of success). Furthermore, the degree of correlation among data of the same cluster (intraclass correlation) is also a relevant concept to assess, but when using logistic regression with random effects it is not possible to get an analytical expression of the estimators for marginal proportion and intraclass correlation. In our paper, we assess and compare approaches using different kinds of approximations: based on the logistic‐normal mixed effects model (LN), linear mixed model (LMM), and generalized estimating equations (GEE). The comparisons are completed by using two real data examples and a simulation study. The results show that the performance of the approaches strongly depends on the magnitude of the marginal proportion, the intraclass correlation, and the sample size. In general, the reliability of the approaches worsens with low marginal proportion and large intraclass correlation. The LMM and GEE approaches emerge as reliable when the sample size is large.

17.
The meta‐analysis of diagnostic accuracy studies is often of interest in screening programs for many diseases. The typical summary statistics for studies chosen for a diagnostic accuracy meta‐analysis are often two dimensional: sensitivities and specificities. The common statistical analysis approach for the meta‐analysis of diagnostic studies is based on the bivariate generalized linear‐mixed model (BGLMM), which has study‐specific interpretations. In this article, we present a population‐averaged (PA) model using generalized estimating equations (GEE) for making inference on mean specificity and sensitivity of a diagnostic test in the population represented by the meta‐analytic studies. We also derive the marginalized counterparts of the regression parameters from the BGLMM. We illustrate the proposed PA approach through two dataset examples and compare performance of estimators of the marginal regression parameters from the PA model with those of the marginalized regression parameters from the BGLMM through Monte Carlo simulation studies. Overall, both marginalized BGLMM and GEE with sandwich standard errors maintained nominal 95% confidence interval coverage levels for mean specificity and mean sensitivity in meta‐analysis of 25 or more studies even under misspecification of the covariance structure of the bivariate positive test counts for diseased and nondiseased subjects.

18.
Sutradhar BC, Das K. Biometrics 2000, 56(2):622-625
Liang and Zeger (1986, Biometrika 73, 13-22) introduced a generalized estimating equation (GEE) approach based on a working correlation matrix to obtain efficient estimators of regression parameters in the class of generalized linear models for repeated measures data. As demonstrated by Crowder (1995, Biometrika 82, 407-410), because of uncertainty of the definition of the working correlation matrix, the Liang-Zeger approach may, in some cases, lead to a complete breakdown of the estimation of the regression parameters. After taking this comment of Crowder into account, recently Sutradhar and Das (1999, Biometrika 86, 459-465) examined the loss of efficiency of the regression estimators due to misspecification of the correlation structures. But their study was confined to the regression estimation with cluster-level covariates, as in the original paper of Liang and Zeger. In this paper, we study this efficiency loss problem for the generalized regression models with within-cluster covariates by utilizing the approach of Sutradhar and Das (1999).

19.
This paper develops a general approach for dealing with parametric transformations of covariates for longitudinal data, where the responses are modeled marginally and generalized estimating equations (GEEs) are used for estimation of regression parameters. We propose an iterative algorithm for obtaining regression and transformation parameters from estimating equations, utilizing existing software for GEE problems. The algorithmic technique is closely related to that used in the Box-Tidwell transformation in classical linear regression, but we develop it under the GEE setting and for more general transformation functions. We provide supporting theorems for consistency and asymptotic Normality of the estimates. Inference between two nested models is also considered. This methodology is applied to two data sets. One consists of pill dissolution data, the other is taken from the Pittsburgh Youth Study (PYS). The PYS is a prospective longitudinal study of the development of delinquency, substance use, and mental health in male youth. We use the model-based parametric approach to examine the association between alcohol use at an early stage of adolescent development and delinquency over the course of adolescence.

20.
The current article explores whether generalized linear models (GLM) and generalized estimating equations (GEE) can be used in place of conventional statistical analyses in the study of ordinal data that code an underlying continuous variable, like entheseal changes. The analysis of artificial data and ordinal data expressing entheseal changes in archaeological North African populations gave the following results. Parametric and nonparametric tests give convergent results, particularly for P values <0.1, irrespective of whether the underlying variable is normally distributed or not, under the condition that the samples involved in the tests exhibit approximately equal sizes. If this prerequisite is valid and provided that the samples are of equal variances, analysis of covariance may be adopted. GLM are not subject to constraints and give results that converge to those obtained from all nonparametric tests. Therefore, they can be used instead of traditional tests, as they provide the same amount of information, with the advantage of allowing the study of the simultaneous impact of multiple predictors and their interactions and the modeling of the experimental data. However, GLM should be replaced by GEE for the study of bilateral asymmetry and in general when paired samples are tested, because GEE are appropriate for correlated data. Am J Phys Anthropol 153:473–483, 2014. © 2013 Wiley Periodicals, Inc.
