Similar articles
20 similar articles found.
1.
GEE with Gaussian estimation of the correlations when data are incomplete (total citations: 4; self-citations: 0; citations by others: 4)
This paper considers a modification of generalized estimating equations (GEE) for handling missing binary response data. The proposed method uses Gaussian estimation of the correlation parameters, i.e., the estimating function that yields an estimate of the correlation parameters is obtained from the multivariate normal likelihood. The proposed method yields consistent estimates of the regression parameters when data are missing completely at random (MCAR). However, when data are missing at random (MAR), consistency may not hold. In a simulation study with repeated binary outcomes that are missing at random, the magnitude of the potential bias that can arise is examined. The results of the simulation study indicate that, when the working correlation matrix is correctly specified, the bias is almost negligible for the modified GEE. In the simulation study, the proposed modification of GEE is also compared to the standard GEE, multiple imputation, and weighted estimating equations approaches. Finally, the proposed method is illustrated using data from a longitudinal clinical trial comparing two therapeutic treatments, zidovudine (AZT) and didanosine (ddI), in patients with HIV.

2.
Wang YG. Biometrics 1999, 55(3): 984-989
Troxel, Lipsitz, and Brennan (1997, Biometrics 53, 857-869) considered parameter estimation from survey data with nonignorable nonresponse and proposed weighted estimating equations to remove the biases in the complete-case analysis that ignores missing observations. This paper suggests two alternative modifications for unbiased estimation of regression parameters when a binary outcome is potentially observed at successive time points. The weighting approach of Robins, Rotnitzky, and Zhao (1995, Journal of the American Statistical Association 90, 106-121) is also modified to obtain unbiased estimating functions. The suggested estimating functions are unbiased only when the missingness probability is correctly specified, and misspecification of the missingness model will result in biases in the estimates. Simulation studies are carried out to assess the performance of different methods when the covariate is binary or normal. For the simulation models used, the efficiency of the two new methods relative to the weighting methods is about 3.0 for the slope parameter and about 2.0 for the intercept parameter when the covariate is continuous and the missingness probability is correctly specified. All methods produce substantial biases in the estimates when the missingness model is misspecified or underspecified. Analysis of data from a medical survey illustrates the use and possible differences of these estimating functions.

3.
Wang CY, Huang WT. Biometrics 2000, 56(1): 98-105
We consider estimation in logistic regression where some covariate variables may be missing at random. Satten and Kupper (1993, Journal of the American Statistical Association 88, 200-208) proposed estimating odds ratio parameters using methods based on the probability of exposure. By approximating a partial likelihood, we extend their idea and propose a method that estimates the cumulant-generating function of the missing covariate given observed covariates and surrogates in the controls. Our proposed method first estimates some lower order cumulants of the conditional distribution of the unobserved data and then solves a resulting estimating equation for the logistic regression parameter. A simple version of the proposed method is to replace a missing covariate by the sum of its conditional mean and conditional variance given observed data in the controls. We note that one important property of the proposed method is that, when the validation is only on controls, a class of inverse selection probability weighted semiparametric estimators cannot be applied because selection probabilities on cases are zero. The proposed estimator performs well unless the relative risk parameters are large, even though it is technically inconsistent. Small-sample simulations are conducted. We illustrate the method by an example of real data analysis.

4.
The purpose of this investigation was to determine the validity of the non-exercise-based equations of Davis et al. (13), Jones et al. (20), and Neder et al. (30) for estimating the ventilatory threshold (VT) in samples of aerobically trained men and women. One hundred and forty-four aerobically trained men (mean ± SD age, 41.0 ± 11.6 years; N = 83) and women (37.1 ± 9.0 years, N = 61) performed a maximal incremental test to determine VO2max and observed VT on a cycle ergometer. The observed VT was determined by gas exchange measurements using the V-slope method (VCO2/VO2) in conjunction with analyses of the ventilatory equivalents (i.e., minute ventilation VE/VO2 and VE/VCO2) and end-tidal gas tensions (i.e., P(ET)O2 and P(ET)CO2) for oxygen and carbon dioxide. The predicted VT values from 14 equations were compared to the observed VT values by examining the constant error (CE), standard error of estimate (SEE), Pearson correlation coefficient (r), and total error (TE). The results of this investigation indicated that all 14 equations resulted in significant (p < 0.008) CE values ranging from 1.13 to 1.72 L·min⁻¹ for the men and from 0.58 to 1.12 L·min⁻¹ for the women. Furthermore, the SEE, r, and TE values ranged from 0.37 to 0.54, from 0.36 to 0.53, and from 0.68 to 1.81 L·min⁻¹, respectively. The lowest TE values for the men and women represented 45 and 36% of the mean of the observed VT values, respectively. The results of this study indicated that the errors associated with all 14 equations were too large to be of practical value for estimating VT in aerobically trained men and women.
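
The four validation statistics named in this abstract can be illustrated with a minimal sketch. The abstract does not give their formulas, so the definitions below are assumptions based on common usage in cross-validation studies: CE as the mean of predicted minus observed, TE as the root mean squared difference, and SEE as the standard deviation of the observed values times sqrt(1 - r²). The function name `validation_stats` and the sample values are hypothetical.

```python
import math

def validation_stats(predicted, observed):
    """Return CE, TE, r and SEE for paired predicted/observed values.

    Assumed definitions (not taken from the abstract):
      CE  = mean(predicted - observed)
      TE  = sqrt(mean((predicted - observed)^2))
      SEE = SD(observed) * sqrt(1 - r^2)
    """
    n = len(observed)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    diffs = [p - o for p, o in zip(predicted, observed)]

    ce = sum(diffs) / n                              # signed bias
    te = math.sqrt(sum(d * d for d in diffs) / n)    # overall error size

    # Pearson correlation from centered sums of squares and products.
    sxy = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    sxx = sum((p - mean_p) ** 2 for p in predicted)
    syy = sum((o - mean_o) ** 2 for o in observed)
    r = sxy / math.sqrt(sxx * syy)

    sd_o = math.sqrt(syy / (n - 1))                  # sample SD of observed
    see = sd_o * math.sqrt(max(0.0, 1.0 - r * r))
    return {"CE": ce, "TE": te, "r": r, "SEE": see}

# Hypothetical VT values in L/min: every prediction is 1.0 too high.
stats = validation_stats([2.0, 3.0, 4.0, 5.0], [1.0, 2.0, 3.0, 4.0])
```

In this made-up example the predictions correlate perfectly with the observations yet are uniformly too high, so r and SEE look ideal while CE and TE expose the offset. That is one reason a study of this kind would report all four quantities together.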

5.
Models for longitudinal data are employed in a wide range of behavioral, biomedical, psychosocial, and health‐care‐related research. One popular model for continuous response is the linear mixed‐effects model (LMM). Although simulations in recent studies show that LMM provides reliable estimates under departures from the normality assumption for complete data, the invariable occurrence of missing data in practical studies renders such robustness results less useful when applied to real study data. In this paper, we show by simulation studies that, in the presence of missing data, estimates of the fixed effects of LMM are biased under departures from normality. We discuss two robust alternatives, the weighted generalized estimating equations (WGEE) and the augmented WGEE (AWGEE), and compare their performance with LMM using real as well as simulated data. Our simulation results show that both WGEE and AWGEE provide valid inference for skewed non‐normal data when missing data follow the missing at random (MAR) mechanism, the most popular missing data mechanism for real study data.

6.
New tests for trend in proportions, in the presence of historical control data, are proposed. One such test is a simple score statistic based on a binomial likelihood for the "current" study and beta-binomial likelihoods for each historical control series. A closely related trend statistic based on estimating equations is also proposed. Trend statistics that allow overdispersed proportions in the current study are also developed, including a version of Tarone's (1982, Biometrics 38, 215-220) test that acknowledges sampling variation in the beta distribution parameters, and a trend statistic based on estimating equations. Each such trend test is evaluated with respect to size and power under both binomial and beta-binomial sampling conditions for the current study, and illustrations are provided.

7.
Sensitivity and specificity are common measures used to evaluate the performance of a diagnostic test. A diagnostic test is often administered at a subunit level, e.g., at the level of a vessel, ear, or eye of a patient, so that the treatment can be targeted at the specific subunit. Therefore, it is essential to evaluate the diagnostic test at the subunit level. Often patients with more negative subunit test results are less likely to receive the gold standard tests than patients with more positive subunit test results. To account for this type of missing data and for correlation between subunit test results, we propose a weighted generalized estimating equations (WGEE) approach to evaluate subunit sensitivities and specificities. A simulation study was conducted to evaluate the performance of the WGEE estimators and the weighted least squares (WLS) estimators (Barnhart and Kosinski, 2003) under a missing at random assumption. The results suggested that the WGEE estimator is consistent under various scenarios of percentage of missing data and sample size, while the WLS approach could yield biased estimators due to a misspecified missing data mechanism. We illustrate the methodology with a cardiology example.

8.
Imputation, weighting, direct likelihood, and direct Bayesian inference (Rubin, 1976) are important approaches for missing data regression. Many useful semiparametric estimators have been developed for regression analysis of data with missing covariates or outcomes. It has been established that some semiparametric estimators are asymptotically equivalent, but it has not been shown that many are numerically the same. We applied some existing methods to a bladder cancer case-control study and noted that they were the same numerically when the observed covariates and outcomes are categorical. To understand the analytical background of this finding, we further show that when observed covariates and outcomes are categorical, some estimators are not only asymptotically equivalent but also actually numerically identical. That is, although their estimating equations are different, they lead numerically to exactly the same root. This includes a simple weighted estimator, an augmented weighted estimator, and a mean-score estimator. The numerical equivalence may elucidate the relationship between imputing scores and weighted estimation procedures.

9.
Donner A, Klar N, Zou G. Biometrics 2004, 60(4): 919-925
Split-cluster designs are frequently used in the health sciences when naturally occurring clusters such as multiple sites or organs in the same subject are assigned to different treatments. However, statistical methods for the analysis of binary data arising from such designs are not well developed. The purpose of this article is to propose and evaluate a new procedure for testing the equality of event rates in a design dividing each of k clusters into two segments having multiple sites (e.g., teeth, lesions). The test statistic proposed is a generalization of a previously published procedure based on adjusting the standard Pearson chi-square statistic, but can also be derived as a score test using the approach of generalized estimating equations.

10.
A dynamic programming filter which provides estimates of the first and second derivatives of empirical displacement data is investigated numerically. This filter uses a weighted least squares criterion in estimating the derivatives. The filter equations are presented together with several numerical examples. These examples are taken from references that proposed other techniques.

11.
Clinical studies are often concerned with assessing whether different raters/methods produce similar values for measuring a quantitative variable. Use of the concordance correlation coefficient as a measure of reproducibility has gained popularity in practice since its introduction by Lin (1989, Biometrics 45, 255-268). Lin's method is applicable for studies evaluating two raters/two methods without replications. Chinchilli et al. (1996, Biometrics 52, 341-353) extended Lin's approach to repeated measures designs by using a weighted concordance correlation coefficient. However, the existing methods cannot easily accommodate covariate adjustment, especially when one needs to model agreement. In this article, we propose a generalized estimating equations (GEE) approach to model the concordance correlation coefficient via three sets of estimating equations. The proposed approach is flexible in that (1) it can accommodate more than two correlated readings and test for the equality of dependent concordance correlation estimates; (2) it can incorporate covariates predictive of the marginal distribution; (3) it can be used to identify covariates predictive of concordance correlation; and (4) it requires minimal distribution assumptions. A simulation study is conducted to evaluate the asymptotic properties of the proposed approach. The method is illustrated with data from two biomedical studies.
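
For orientation, the basic two-rater concordance correlation coefficient of Lin (1989) that this abstract builds on can be sketched as follows. This is only the simple sample version, 2·s_xy / (s_x² + s_y² + (x̄ - ȳ)²) with 1/n moments, not the GEE approach proposed in the article. The function name `concordance_cc` and the readings are hypothetical.

```python
def concordance_cc(x, y):
    """Sample concordance correlation coefficient for two raters.

    Uses 1/n (moment) estimates of the variances and covariance:
        ccc = 2*s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2)
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # The squared mean difference penalizes systematic shifts between raters.
    return 2.0 * sxy / (sxx + syy + (mx - my) ** 2)

rater_a = [1.0, 2.0, 3.0, 4.0]
rater_b = [2.0, 3.0, 4.0, 5.0]          # same readings shifted by one unit
same = concordance_cc(rater_a, rater_a)    # perfect agreement
shifted = concordance_cc(rater_a, rater_b) # perfectly correlated, but biased
```

The shifted readings have a Pearson correlation of 1, yet their concordance is below 1 because the coefficient measures agreement with the 45-degree line rather than linear association.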

12.
A technique for obtaining unbiased estimates of genetic parameters (allelic frequencies of RAPD loci, heterozygosity (H), Wright's F statistic, and Nei's genetic distances) in populations of the European (Capreolus capreolus L.) and Siberian (Capreolus pygargus Pall.) roe deer is presented. The technique employs jackknifing and multiple comparative analysis based on a modified Holm's procedure for Bonferroni's test. It was demonstrated that samples from local groups of roe deer in the Trans-Ural region did not differ significantly in allelic frequencies (0.8, 0.81, and 0.78; P > 0.447) or Nei's genetic distances (0.0056, 0.0273, and 0.0218; P = 0.26), but they could be differentiated based on Wright's F statistic (0.0346, 0.0519, and 0.0450; P = 10⁻⁹). The parameters of intrapopulation heterozygosity (from 0.18 to 0.042) formed a gradient from east to west. Calibration estimates of the molecular evolution rate in the family Cervidae, obtained from published data and from Jukes-Cantor genetic distances estimated in this study, demonstrated that the Siberian roe deer split into two subspecies, C. pygargus pygargus Pall. and C. pygargus tianschanicus Satunin, between 229 and 462.3 thousand years ago. The species formation of the Siberian and European roe deer was dated between 1.375 and 2.75 Myr ago. Based on the results obtained, we recommend the approaches used in the study for analysis of population genetic structure and phylogenetic relationships between populations, subspecies, species, and higher taxa.
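
The abstract names jackknifing as the route to unbiased estimates but does not describe the procedure. A generic leave-one-out jackknife, offered here only as a sketch of the general technique and not as the authors' implementation, could look like this. The names `jackknife` and `plugin_variance` and the data are hypothetical.

```python
def jackknife(values, statistic):
    """Leave-one-out jackknife for a statistic of a list of values.

    Returns (bias_corrected_estimate, standard_error).
    """
    n = len(values)
    full = statistic(values)
    # Recompute the statistic n times, each time dropping one observation.
    loo = [statistic(values[:i] + values[i + 1:]) for i in range(n)]
    loo_mean = sum(loo) / n
    bias_corrected = n * full - (n - 1) * loo_mean
    variance = (n - 1) / n * sum((t - loo_mean) ** 2 for t in loo)
    return bias_corrected, variance ** 0.5

def plugin_variance(v):
    """Biased (divide-by-n) variance, used here as a statistic to correct."""
    m = sum(v) / len(v)
    return sum((a - m) ** 2 for a in v) / len(v)

data = [2.0, 4.0, 4.0, 6.0]
corrected, se = jackknife(data, plugin_variance)
```

For the divide-by-n variance, the jackknife correction reproduces the usual divide-by-(n-1) estimate, which makes it a convenient check. Genetic parameters such as heterozygosity would be passed in as a different `statistic` function.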

13.
This paper presents a method for analysing longitudinal data when there are dropouts. In particular, we develop a simple method based on generalized linear mixture models for handling nonignorable dropouts for a variety of discrete and continuous outcomes. Statistical inference for the model parameters is based on a generalized estimating equations (GEE) approach (Liang and Zeger, 1986). The proposed method yields estimates of the model parameters that are valid when nonresponse is nonignorable under a variety of assumptions concerning the dropout process. Furthermore, the proposed method can be implemented using widely available statistical software. Finally, an example using data from a clinical trial of contracepting women is used to illustrate the methodology.

14.
This paper considers the impact of bias in the estimation of the association parameters for longitudinal binary responses when there are drop-outs. A number of different estimating equation approaches are considered for the case where drop-out cannot be assumed to be a completely random process. In particular, standard generalized estimating equations (GEE), GEE based on conditional residuals, GEE based on multivariate normal estimating equations for the covariance matrix, and second-order estimating equations (GEE2) are examined. These different GEE estimators are compared in terms of finite sample and asymptotic bias under a variety of drop-out processes. Finally, the relationship between bias in the estimation of the association parameters and bias in the estimation of the mean parameters is explored.

15.
Heagerty PJ, Zeger SL. Biometrics 2000, 56(3): 719-732
We develop semiparametric estimation methods for a pair of regressions that characterize the first and second moments of clustered discrete survival times. In the first regression, we represent discrete survival times through univariate continuation indicators whose expectations are modeled using a generalized linear model. In the second regression, we model the marginal pairwise association of survival times using the Clayton-Oakes cross-product ratio (Clayton, 1978, Biometrika 65, 141-151; Oakes, 1989, Journal of the American Statistical Association 84, 487-493). These models have recently been proposed by Shih (1998, Biometrics 54, 1115-1128). We relate the discrete survival models to multivariate multinomial models presented in Heagerty and Zeger (1996, Journal of the American Statistical Association 91, 1024-1036) and derive a paired estimating equations procedure that is computationally feasible for moderate and large clusters. We extend the work of Guo and Lin (1994, Biometrics 50, 632-639) and Shih (1998) to allow covariance weighted estimating equations and investigate the impact of weighting in terms of asymptotic relative efficiency. We demonstrate that the multinomial structure must be acknowledged when adopting weighted estimating equations and show that a naive use of GEE methods can lead to inconsistent parameter estimates. Finally, we illustrate the proposed methodology by analyzing psychological testing data previously summarized by TenHave and Uttal (1994, Applied Statistics 43, 371-384) and Guo and Lin (1994).

16.
Wang CY. Biometrics 2000, 56(1): 106-112
Consider the problem of estimating the correlation between two nutrient measurements, such as the percent energy from fat obtained from a food frequency questionnaire (FFQ) and that from repeated food records or 24-hour recalls. Under a classical additive model for repeated food records, it is known that there is an attenuation effect on the correlation estimation if the sample average of repeated food records for each subject is used to estimate the underlying long-term average. This paper considers the case in which the selection probability of a subject for participation in the calibration study, in which repeated food records are measured, depends on the corresponding FFQ value, and the repeated longitudinal measurement errors have an autoregressive structure. This paper investigates a normality-based estimator and compares it with a simple method of moments. Both methods are consistent if the first two moments of nutrient measurements exist. Furthermore, joint estimating equations are applied to estimate the correlation coefficient and related nuisance parameters simultaneously. This approach provides a simple sandwich formula for the covariance estimation of the estimator. Finite sample performance is examined via a simulation study, and the proposed weighted normality-based estimator performs well under various distributional assumptions. The methods are applied to real data from a dietary assessment study.

17.
Pawel, D. J., Preston, D. L., Pierce, D. A. and Cologne, J. B. Improved Estimates of Cancer Site-Specific Risks for A-Bomb Survivors. Radiat. Res. 169, 87-98 (2008). Simple methods are investigated for improving summary site-specific radiogenic risk estimates. Estimates in this report are derived from cancer incidence data from the Life Span Study (LSS) cohort of A-bomb survivors, who are followed up by the Radiation Effects Research Foundation (RERF). Estimates from the LSS of excess relative risk (ERR) for solid cancer sites have typically been derived separately for each site. Even though the data for this are extensive, the statistical imprecision in site-specific (organ-specific) risk estimates is substantial, and it is clear that a large portion of the site-specific variation in estimates is due to this imprecision. Empirical Bayes (EB) estimates offer a reasonable approach for moderating this variation. The simple EB estimates that we applied to the LSS data are weighted averages of a pooled overall estimate of ERR and separately derived site-specific estimates, with weights determined by the data. Results indicate that the EB estimates are most useful for sites such as esophageal or bladder cancer, for which the separately derived ERR estimates are less precise than for other sites.
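
The "weighted average of a pooled estimate and a site-specific estimate" described in this abstract can be illustrated with a generic shrinkage sketch. The abstract does not state how the weights are computed, so the normal-normal form below is an assumption: each site's weight is tau2 / (tau2 + se²), where tau2 is a between-site variance. Here tau2 is passed in as a known value for simplicity, whereas an empirical Bayes analysis would estimate it from the data. The function name `eb_shrink` and all numbers are hypothetical.

```python
def eb_shrink(estimates, std_errors, tau2):
    """Shrink site-specific estimates toward a pooled estimate.

    Assumed model: estimate_i ~ Normal(theta_i, se_i^2) and
    theta_i ~ Normal(mu, tau2). tau2 is supplied by the caller here.
    """
    # Pooled estimate: precision-weighted mean under the assumed model.
    precisions = [1.0 / (tau2 + se ** 2) for se in std_errors]
    pooled = sum(w * e for w, e in zip(precisions, estimates)) / sum(precisions)

    shrunk = []
    for e, se in zip(estimates, std_errors):
        weight = tau2 / (tau2 + se ** 2)   # near 1 when the site estimate is precise
        shrunk.append(weight * e + (1.0 - weight) * pooled)
    return pooled, shrunk

# Two hypothetical sites with equal precision.
pooled, shrunk = eb_shrink([0.2, 0.8], [0.1, 0.1], tau2=0.01)
```

Imprecise site estimates (large standard errors) receive small weights and are pulled strongly toward the pooled value, which matches the abstract's remark that the approach helps most for sites with less precise estimates.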

18.
Liu D, Zhou XH. Biometrics 2011, 67(3): 906-916
Covariate-specific receiver operating characteristic (ROC) curves are often used to evaluate the classification accuracy of a medical diagnostic test or a biomarker, when the accuracy of the test is associated with certain covariates. In many large-scale screening tests, the gold standard is subject to missingness due to high cost or harmfulness to the patient. In this article, we propose a semiparametric estimation of the covariate-specific ROC curves with a partially missing gold standard. A location-scale model is constructed for the test result to model the covariates' effect, but the residual distributions are left unspecified. Thus the baseline and link functions of the ROC curve both have flexible shapes. With the gold standard missing at random (MAR) assumption, we consider weighted estimating equations for the location-scale parameters, and weighted kernel estimating equations for the residual distributions. Three ROC curve estimators are proposed and compared, namely, imputation-based, inverse probability weighted, and doubly robust estimators. We derive the asymptotic normality of the estimated ROC curve, as well as the analytical form of the standard error estimator. The proposed method is motivated by and applied to data from an Alzheimer's disease study.

19.
The classical normal-theory tests for testing the null hypothesis of common variance and the classical estimates of scale have long been known to be quite nonrobust to even mild deviations from normality assumptions for moderate sample sizes. Levene (1960) suggested a one-way ANOVA type statistic as a robust test. Brown and Forsythe (1974) considered a modified version of Levene's test by replacing the sample means with sample medians as estimates of population locations, and their test is computationally the simplest among the three tests recommended by Conover, Johnson, and Johnson (1981) in terms of robustness and power. In this paper a new robust and powerful test for homogeneity of variances is proposed based on a modification of Levene's test using the weighted likelihood estimates (Markatou, Basu, and Lindsay, 1996) of the population means. For two and three populations the proposed test using the Hellinger distance based weighted likelihood estimates is observed to achieve better empirical level and power than Brown-Forsythe's test in symmetric distributions having a thicker tail than the normal, and higher empirical power in skew distributions under the use of F distribution critical values.
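
The Brown-Forsythe variant mentioned in this abstract (Levene's one-way ANOVA statistic computed on absolute deviations from each group's median) can be sketched as follows. This illustrates the comparison test only, not the weighted likelihood modification the paper proposes. The function name `brown_forsythe_f` and the data are hypothetical.

```python
from statistics import median

def brown_forsythe_f(groups):
    """One-way ANOVA F statistic on absolute deviations from group medians."""
    # Step 1: replace each observation by its distance to the group median.
    z = []
    for g in groups:
        med = median(g)
        z.append([abs(x - med) for x in g])

    k = len(z)
    n_total = sum(len(zi) for zi in z)
    grand_mean = sum(sum(zi) for zi in z) / n_total
    group_means = [sum(zi) / len(zi) for zi in z]

    # Step 2: ordinary between-group and within-group mean squares on z.
    between = sum(
        len(zi) * (m - grand_mean) ** 2 for zi, m in zip(z, group_means)
    ) / (k - 1)
    within = sum(
        (v - m) ** 2 for zi, m in zip(z, group_means) for v in zi
    ) / (n_total - k)
    return between / within

# Second group has twice the spread of the first.
f_stat = brown_forsythe_f([[1, 2, 3, 4, 5], [2, 4, 6, 8, 10]])
```

The statistic is referred to an F distribution with (k - 1, N - k) degrees of freedom. The sketch stops at the statistic because the Python standard library has no F distribution.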

20.
Chen B, Zhou XH. Biometrics 2011, 67(3): 830-842
Longitudinal studies often feature incomplete response and covariate data. Likelihood-based methods such as the expectation-maximization algorithm give consistent estimators for model parameters when data are missing at random (MAR) provided that the response model and the missing covariate model are correctly specified; however, we do not need to specify the missing data mechanism. An alternative method is the weighted estimating equation, which gives consistent estimators if the missing data and response models are correctly specified; however, we do not need to specify the distribution of the covariates that have missing values. In this article, we develop a doubly robust estimation method for longitudinal data with missing response and missing covariate when data are MAR. This method is appealing in that it can provide consistent estimators if either the missing data model or the missing covariate model is correctly specified. Simulation studies demonstrate that this method performs well in a variety of situations.
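
The doubly robust idea can be illustrated in a much simpler setting than the one in this article: estimating a single mean when some responses are missing. The sketch below is the standard augmented inverse probability weighted (AIPW) form and is not the authors' longitudinal estimator. The function name `aipw_mean` and its inputs are hypothetical, with `prob_observed` standing for a fitted missing data model and `outcome_pred` for a fitted outcome model.

```python
def aipw_mean(y, observed, prob_observed, outcome_pred):
    """Augmented IPW estimate of the mean of y with missing responses.

    y[i] may be None when observed[i] is 0. The estimate stays consistent
    if either prob_observed or outcome_pred comes from a correct model.
    """
    total = 0.0
    for yi, ri, pi, mi in zip(y, observed, prob_observed, outcome_pred):
        # Inverse probability weighted contribution of observed responses.
        ipw_term = yi / pi if ri else 0.0
        # Augmentation term: mean zero when the missingness model is right,
        # and cancels the weighting error when the outcome model is right.
        total += ipw_term - (ri - pi) / pi * mi
    return total / len(y)

y = [1.0, None, 3.0, None]
observed = [1, 0, 1, 0]

# Deliberately wrong missingness probabilities, paired with outcome
# predictions that match the observed responses.
estimate = aipw_mean(y, observed, [0.9, 0.2, 0.7, 0.4], [1.0, 2.0, 3.0, 2.0])
```

Setting every entry of `outcome_pred` to zero reduces the formula to the plain inverse probability weighted mean, which relies entirely on the missingness model.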
