首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
In this paper, we develop a Gaussian estimation (GE) procedure to estimate the parameters of a regression model for correlated (longitudinal) binary response data using a working correlation matrix. A two‐step iterative procedure is proposed for estimating the regression parameters by the GE method and the correlation parameters by the method of moments. Consistency properties of the estimators are discussed. A simulation study was conducted to compare 11 estimators of the regression parameters, namely, four versions of the GE, five versions of the generalized estimating equations (GEEs), and two versions of the weighted GEE. Simulations show that (i) the Gaussian estimates have the smallest mean square error and best coverage probability if the working correlation structure is correctly specified and (ii) when the working correlation structure is correctly specified, the GE and the GEE with exchangeable correlation structure perform best as opposed to when the correlation structure is misspecified.  相似文献   

2.
Wang YG  Lin X 《Biometrics》2005,61(2):413-421
The approach of generalized estimating equations (GEE) is based on the framework of generalized linear models but allows for specification of a working matrix for modeling within-subject correlations. The variance is often assumed to be a known function of the mean. This article investigates the impacts of misspecifying the variance function on estimators of the mean parameters for quantitative responses. Our numerical studies indicate that (1) correct specification of the variance function can improve the estimation efficiency even if the correlation structure is misspecified; (2) misspecification of the variance function impacts much more on estimators for within-cluster covariates than for cluster-level covariates; and (3) if the variance function is misspecified, correct choice of the correlation structure may not necessarily improve estimation efficiency. We illustrate impacts of different variance functions using a real data set from cow growth.  相似文献   

3.
Wang YG  Zhao Y 《Biometrics》2007,63(3):681-689
We consider the analysis of longitudinal data when the covariance function is modeled by additional parameters to the mean parameters. In general, inconsistent estimators of the covariance (variance/correlation) parameters will be produced when the "working" correlation matrix is misspecified, which may result in great loss of efficiency of the mean parameter estimators (albeit the consistency is preserved). We consider using different "working" correlation models for the variance and the mean parameters. In particular, we find that an independence working model should be used for estimating the variance parameters to ensure their consistency in case the correlation structure is misspecified. The designated "working" correlation matrices should be used for estimating the mean and the correlation parameters to attain high efficiency for estimating the mean parameters. Simulation studies indicate that the proposed algorithm performs very well. We also applied different estimation procedures to a data set from a clinical trial for illustration.  相似文献   

4.
Wang YG 《Biometrics》1999,55(4):1263-1268
We consider the problem of estimating a population size from successive catches taken during a removal experiment and propose two estimating functions approaches, the traditional quasi-likelihood (TQL) approach for dependent observations and the conditional quasi-likelihood (CQL) approach using the conditional mean and conditional variance of the catch given previous catches. Asymptotic covariance of the estimates and the relationship between the two methods are derived. Simulation results and application to the catch data from smallmouth bass show that the proposed estimating functions perform better than other existing methods, especially in the presence of overdispersion.  相似文献   

5.
GEE with Gaussian estimation of the correlations when data are incomplete   总被引:4,自引:0,他引:4  
This paper considers a modification of generalized estimating equations (GEE) for handling missing binary response data. The proposed method uses Gaussian estimation of the correlation parameters, i.e., the estimating function that yields an estimate of the correlation parameters is obtained from the multivariate normal likelihood. The proposed method yields consistent estimates of the regression parameters when data are missing completely at random (MCAR). However, when data are missing at random (MAR), consistency may not hold. In a simulation study with repeated binary outcomes that are missing at random, the magnitude of the potential bias that can arise is examined. The results of the simulation study indicate that, when the working correlation matrix is correctly specified, the bias is almost negligible for the modified GEE. In the simulation study, the proposed modification of GEE is also compared to the standard GEE, multiple imputation, and weighted estimating equations approaches. Finally, the proposed method is illustrated using data from a longitudinal clinical trial comparing two therapeutic treatments, zidovudine (AZT) and didanosine (ddI), in patients with HIV.  相似文献   

6.
Yin G  Cai J 《Biometrics》2005,61(1):151-161
As an alternative to the mean regression model, the quantile regression model has been studied extensively with independent failure time data. However, due to natural or artificial clustering, it is common to encounter multivariate failure time data in biomedical research where the intracluster correlation needs to be accounted for appropriately. For right-censored correlated survival data, we investigate the quantile regression model and adapt an estimating equation approach for parameter estimation under the working independence assumption, as well as a weighted version for enhancing the efficiency. We show that the parameter estimates are consistent and asymptotically follow normal distributions. The variance estimation using asymptotic approximation involves nonparametric functional density estimation. We employ the bootstrap and perturbation resampling methods for the estimation of the variance-covariance matrix. We examine the proposed method for finite sample sizes through simulation studies, and illustrate it with data from a clinical trial on otitis media.  相似文献   

7.
A genetic model for modified diallel crosses is proposed for estimating variance and covariance components of cytoplasmic, maternal additive and dominance effects, as well as direct additive and dominance effects. Monte Carlo simulations were conducted to compare the efficiencies of minimum norm quadratic unbiased estimation (MINQUE) methods. For both balanced and unbalanced mating designs, MINQUE (0/1), which has 0 for all the prior covariances and 1 for all the prior variances, has similar efficiency to MINQUE(), which has parameter values for the prior values. Unbiased estimates of variance and covariance components and their sampling variances could be obtained with MINQUE(0/1) and jackknifing. A t-test following jackknifing is applicable to test hypotheses for zero variance and covariance components. The genetic model is robust for estimating variance and covariance components under several situations of no specific effects. A MINQUE(0/1) procedure is suggested for unbiased estimation of covariance components between two traits with equal design matrices. Methods of unbiased prediction for random genetic effects are discussed. A linear unbiased prediction (LUP) method is shown to be efficient for the genetic model. An example is given for a demonstration of estimating variance and covariance components and predicting genetic effects.  相似文献   

8.
Chen H  Wang Y 《Biometrics》2011,67(3):861-870
In this article, we propose penalized spline (P-spline)-based methods for functional mixed effects models with varying coefficients. We decompose longitudinal outcomes as a sum of several terms: a population mean function, covariates with time-varying coefficients, functional subject-specific random effects, and residual measurement error processes. Using P-splines, we propose nonparametric estimation of the population mean function, varying coefficient, random subject-specific curves, and the associated covariance function that represents between-subject variation and the variance function of the residual measurement errors which represents within-subject variation. Proposed methods offer flexible estimation of both the population- and subject-level curves. In addition, decomposing variability of the outcomes as a between- and within-subject source is useful in identifying the dominant variance component therefore optimally model a covariance function. We use a likelihood-based method to select multiple smoothing parameters. Furthermore, we study the asymptotics of the baseline P-spline estimator with longitudinal data. We conduct simulation studies to investigate performance of the proposed methods. The benefit of the between- and within-subject covariance decomposition is illustrated through an analysis of Berkeley growth data, where we identified clearly distinct patterns of the between- and within-subject covariance functions of children's heights. We also apply the proposed methods to estimate the effect of antihypertensive treatment from the Framingham Heart Study data.  相似文献   

9.
The weights used in iterative weighted least squares (IWLS) regression are usually estimated parametrically using a working model for the error variance. When the variance function is misspecified, the IWLS estimates of the regression coefficients β are still asymptotically consistent but there is some loss in efficiency. Since second moments can be quite hard to model, it makes sense to estimate the error variances nonparametrically and to employ weights inversely proportional to the estimated variances in computing the WLS estimate for β. Surprisingly, this approach had not received much attention in the literature. The aim of this note is to demonstrate that such a procedure can be implemented easily in S-plus using standard functions with default options making it suitable for routine applications. The particular smoothing method that we use is local polynomial regression applied to the logarithm of the squared residuals but other smoothers can be tried as well. The proposed procedure is applied to data on the use of two different assay methods for a hormone. Efficiency calculations based on the estimated model show that the nonparametric IWLS estimates are more efficient than the parametric IWLS estimates based on three different plausible working models for the variance function. The proposed estimators also perform well in a simulation study using both parametric and nonparametric variance functions as well as normal and gamma errors.  相似文献   

10.
Since Liang and Zeger (1986) proposed the ‘generalized estimating equations’ approach for the estimation of regression parameters in models with correlated discrete responses, a lot of work has been devoted to the investigation of the properties of the corresponding GEE estimators. However, the effects of different kinds of covariates have often been overlooked. In this paper it is shown that the use of non-singular block invariant matrices of covariates, as e.g. a design matrix in an analysis of variance model, leads to GEE estimators which are identical regardless of the ‘working’ correlation matrix used. Moreover, they are efficient (McCullagh, 1983). If on the other hand only covariates are used which are invariant within blocks, the efficiency gain in choosing the ‘correct’ vs. an ‘incorrect’ correlation structure is shown to be negligible. The results of a simple simulation study suggest that although different GEE estimators are not identical and are not as efficient as a ML estimator, the differences are still negligible if both types of invariant covariates are present.  相似文献   

11.
This paper discusses a two‐state hidden Markov Poisson regression (MPR) model for analyzing longitudinal data of epileptic seizure counts, which allows for the rate of the Poisson process to depend on covariates through an exponential link function and to change according to the states of a two‐state Markov chain with its transition probabilities associated with covariates through a logit link function. This paper also considers a two‐state hidden Markov negative binomial regression (MNBR) model, as an alternative, by using the negative binomial instead of Poisson distribution in the proposed MPR model when there exists extra‐Poisson variation conditional on the states of the Markov chain. The two proposed models in this paper relax the stationary requirement of the Markov chain, allow for overdispersion relative to the usual Poisson regression model and for correlation between repeated observations. The proposed methodology provides a plausible analysis for the longitudinal data of epileptic seizure counts, and the MNBR model fits the data much better than the MPR model. Maximum likelihood estimation using the EM and quasi‐Newton algorithms is discussed. A Monte Carlo study for the proposed MPR model investigates the reliability of the estimation method, the choice of probabilities for the initial states of the Markov chain, and some finite sample behaviors of the maximum likelihood estimates, suggesting that (1) the estimation method is accurate and reliable as long as the total number of observations is reasonably large, and (2) the choice of probabilities for the initial states of the Markov process has little impact on the parameter estimates.  相似文献   

12.
Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1). Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fit a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established - yet unstable - approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures.  相似文献   

13.
When clustered multinomial responses are fit using the generalized logistic link, Morel (1989) introduced a small sample correction in the Taylor series based estimator of the covariance matrix of the parameter estimates. The correction reduces the bias of the Type I error rates in small samples and guarantees positive definiteness of the estimated variance‐covariance matrix. It is well known that small sample bias in the use of the Delta method persists in any application of the Generalized Estimating Equations (GEE) methodology. In this article, we extend the correction originally suggested for the generalized logistic link, to other link functions and distributions, when parameters are estimated by GEE. In a Monte Carlo study with correlated data generated under different sampling schemes, the small sample correction has been shown to be effective in reducing the Type I error rates when the number of clusters is relatively small.  相似文献   

14.
Aitkin M 《Biometrics》1999,55(1):117-128
This paper describes an EM algorithm for nonparametric maximum likelihood (ML) estimation in generalized linear models with variance component structure. The algorithm provides an alternative analysis to approximate MQL and PQL analyses (McGilchrist and Aisbett, 1991, Biometrical Journal 33, 131-141; Breslow and Clayton, 1993; Journal of the American Statistical Association 88, 9-25; McGilchrist, 1994, Journal of the Royal Statistical Society, Series B 56, 61-69; Goldstein, 1995, Multilevel Statistical Models) and to GEE analyses (Liang and Zeger, 1986, Biometrika 73, 13-22). The algorithm, first given by Hinde and Wood (1987, in Longitudinal Data Analysis, 110-126), is a generalization of that for random effect models for overdispersion in generalized linear models, described in Aitkin (1996, Statistics and Computing 6, 251-262). The algorithm is initially derived as a form of Gaussian quadrature assuming a normal mixing distribution, but with only slight variation it can be used for a completely unknown mixing distribution, giving a straightforward method for the fully nonparametric ML estimation of this distribution. This is of value because the ML estimates of the GLM parameters can be sensitive to the specification of a parametric form for the mixing distribution. The nonparametric analysis can be extended straightforwardly to general random parameter models, with full NPML estimation of the joint distribution of the random parameters. This can produce substantial computational saving compared with full numerical integration over a specified parametric distribution for the random parameters. A simple method is described for obtaining correct standard errors for parameter estimates when using the EM algorithm. Several examples are discussed involving simple variance component and longitudinal models, and small-area estimation.  相似文献   

15.
Longitudinal studies are often applied in biomedical research and clinical trials to evaluate the treatment effect. The association pattern within the subject must be considered in both sample size calculation and the analysis. One of the most important approaches to analyze such a study is the generalized estimating equation (GEE) proposed by Liang and Zeger, in which “working correlation structure” is introduced and the association pattern within the subject depends on a vector of association parameters denoted by ρ. The explicit sample size formulas for two‐group comparison in linear and logistic regression models are obtained based on the GEE method by Liu and Liang. For cluster randomized trials (CRTs), researchers proposed the optimal sample sizes at both the cluster and individual level as a function of sampling costs and the intracluster correlation coefficient (ICC). In these approaches, the optimal sample sizes depend strongly on the ICC. However, the ICC is usually unknown for CRTs and multicenter trials. To overcome this shortcoming, Van Breukelen et al. consider a range of possible ICC values identified from literature reviews and present Maximin designs (MMDs) based on relative efficiency (RE) and efficiency under budget and cost constraints. In this paper, the optimal sample size and number of repeated measurements using GEE models with an exchangeable working correlation matrix is proposed under the considerations of fixed budget, where “optimal” refers to maximum power for a given sampling budget. The equations of sample size and number of repeated measurements for a known parameter value ρ are derived and a straightforward algorithm for unknown ρ is developed. Applications in practice are discussed. We also discuss the existence of the optimal design when an AR(1) working correlation matrix is assumed. Our proposed method can be extended under the scenarios when the true and working correlation matrix are different.  相似文献   

16.
Heagerty PJ  Zeger SL 《Biometrics》2000,56(3):719-732
We develop semiparametric estimation methods for a pair of regressions that characterize the first and second moments of clustered discrete survival times. In the first regression, we represent discrete survival times through univariate continuation indicators whose expectations are modeled using a generalized linear model. In the second regression, we model the marginal pairwise association of survival times using the Clayton-Oakes cross-product ratio (Clayton, 1978, Biometrika 65, 141-151; Oakes, 1989, Journal of the American Statistical Association 84, 487-493). These models have recently been proposed by Shih (1998, Biometrics 54, 1115-1128). We relate the discrete survival models to multivariate multinomial models presented in Heagerty and Zeger (1996, Journal of the American Statistical Society 91, 1024-1036) and derive a paired estimating equations procedure that is computationally feasible for moderate and large clusters. We extend the work of Guo and Lin (1994, Biometrics 50, 632-639) and Shih (1998) to allow covariance weighted estimating equations and investigate the impact of weighting in terms of asymptotic relative efficiency. We demonstrate that the multinomial structure must be acknowledged when adopting weighted estimating equations and show that a naive use of GEE methods can lead to inconsistent parameter estimates. Finally, we illustrate the proposed methodology by analyzing psychological testing data previously summarized by TenHave and Uttal (1994, Applied Statistics 43, 371-384) and Guo and Lin (1994).  相似文献   

17.
This paper considers the impact of bias in the estimation of the association parameters for longitudinal binary responses when there are drop-outs. A number of different estimating equation approaches are considered for the case where drop-out cannot be assumed to be a completely random process. In particular, standard generalized estimating equations (GEE), GEE based on conditional residuals, GEE based on multivariate normal estimating equations for the covariance matrix, and second-order estimating equations (GEE2) are examined. These different GEE estimators are compared in terms of finite sample and asymptotic bias under a variety of drop-out processes. Finally, the relationship between bias in the estimation of the association parameters and bias in the estimation of the mean parameters is explored.  相似文献   

18.
Generalized linear models are a widely used method to obtain parametric estimates for the mean function. They have been further extended to allow the relationship between the mean function and the covariates to be more flexible via generalized additive models. However, the fixed variance structure can in many cases be too restrictive. The extended quasilikelihood (EQL) framework allows for estimation of both the mean and the dispersion/variance as functions of covariates. As for other maximum likelihood methods though, EQL estimates are not resistant to outliers: we need methods to obtain robust estimates for both the mean and the dispersion function. In this article, we obtain functional estimates for the mean and the dispersion that are both robust and smooth. The performance of the proposed method is illustrated via a simulation study and some real data examples.  相似文献   

19.
Diallel analysis for sex-linked and maternal effects   总被引:40,自引:0,他引:40  
Genetic models including sex-linked and maternal effects as well as autosomal gene effects are described. Monte Carlo simulations were conducted to compare efficiencies of estimation by minimum norm quadratic unbiased estimation (MINQUE) and restricted maximum likelihood (REML) methods. MINQUE(1), which has 1 for all prior values, has a similar efficiency to MINQUE(), which requires prior estimates of parameter values. MINQUE(1) has the advantage over REML of unbiased estimation and convenient computation. An adjusted unbiased prediction (AUP) method is developed for predicting random genetic effects. AUP is desirable for its easy computation and unbiasedness of both mean and variance of predictors. The jackknife procedure is appropriate for estimating the sampling variances of estimated variances (or covariances) and of predicted genetic effects. A t-test based on jackknife variances is applicable for detecting significance of variation. Worked examples from mice and silkworm data are given in order to demonstrate variance and covariance estimation and genetic effect prediction.  相似文献   

20.
The assessment of population trends is a key point in wildlife conservation. Survey data collected over long period may not be comparable due to the presence of environmental biases (i.e. inadequate representation of the variability of environmental covariates in the study area). Moreover, count data may be affected by both overdispersion (i.e. the variance is larger than the mean) and excess of zero counts (potentially leading to zero inflation). The aim of this study was to define a modelling procedure to assess long-term population trends that addressed these three issues and to shed light on the effects of environmental bias, overdispersion, and zero inflation on trend estimates. To test our procedure, we used six bird species whose data were collected in northern Italy from 1992 to 2019. We designed a multi-step approach. First, using generalised additive models (GAMs), we implemented a full factorial design of models (eight models per species) taking or not into account the environmental bias (including or not including environmental covariates, respectively), overdispersion (using a negative binomial distribution or a Poisson distribution, respectively), and zero inflation (using or not using zero-inflated models, respectively). Models were ranked according to the Akaike Information Criterion. Second, annual population indices (median and 95% confidence interval of the number of breeding pairs per point count) were predicted through a parametric bootstrap procedure. Third, long-term population trends were assessed and tested for significance fitting weighted least square linear regression models to the predicted annual indices. To evaluate the effect of environmental bias, overdispersion, and zero inflation on trend estimates, an average discrepancy index was calculated for each model group. The results showed that environmental bias was the most important driver in determining different trend estimates, although overlooking overdispersion and zero inflation could lead to misleading results. For five species, zero-inflated GAMs resulted the best models to predict annual population indices. Our findings suggested a mutual interaction between zero inflation and overdispersion, with overdispersion arising in non-zero-inflated models. Moreover, for species having flocking foraging and/or colonial breeding behaviours, overdispersed and zero-inflated models may be more adequate. In conclusion, properly handling environmental bias, which may affect several data sets coming from long-term monitoring programs, is crucial to obtain reliable estimates of population trends. Furthermore, the extent to which overdispersion and zero inflation may affect trend estimates should be assessed by comparing different models, rather than presumed using statistical assumption.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号