Similar articles
 20 similar articles found (search time: 15 ms)
1.
A class of generalized linear mixed models can be obtained by introducing random effects in the linear predictor of a generalized linear model, e.g. a split plot model for binary data or count data. Maximum likelihood estimation, for normally distributed random effects, involves high-dimensional numerical integration, with severe limitations on the number and structure of the additional random effects. An alternative estimation procedure based on an extension of the iterative re-weighted least squares procedure for generalized linear models will be illustrated on a practical data set involving carcass classification of cattle. The data are analysed as overdispersed binomial proportions with fixed and random effects and associated components of variance on the logit scale. Estimates are obtained with standard software for normal data mixed models. Numerical restrictions pertain to the size of matrices to be inverted. This can be dealt with by absorption techniques familiar from, e.g., mixed models in animal breeding. The final model fitted to the classification data includes four components of variance and a multiplicative overdispersion factor. Basically the estimation procedure is a combination of iterated least squares procedures and no full distributional assumptions are needed. A simulation study based on the classification data is presented. This includes a study of procedures for constructing confidence intervals and significance tests for fixed effects and components of variance. The simulation results increase confidence in the usefulness of the estimation procedure.
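
As a rough illustration of the iterative re-weighted least squares step that the abstract above extends, the sketch below fits an ordinary logistic regression with an intercept and one covariate. It covers fixed effects only (no random effects, variance components, or overdispersion factor), and the function name `irls_logistic` and the toy data are assumptions made for illustration, not taken from the paper.

```python
import math

def irls_logistic(x, y, n_iter=25):
    """Fit logit(P(y=1)) = b0 + b1*x by iterative re-weighted least squares.

    Hypothetical helper: x and y are equal-length lists, y holds 0/1 values.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        s_w = s_wx = s_wxx = s_wz = s_wxz = 0.0
        for xi, yi in zip(x, y):
            eta = b0 + b1 * xi                  # linear predictor
            mu = 1.0 / (1.0 + math.exp(-eta))   # fitted probability
            w = mu * (1.0 - mu)                 # IRLS weight for the logit link
            z = eta + (yi - mu) / w             # working (adjusted) response
            s_w += w
            s_wx += w * xi
            s_wxx += w * xi * xi
            s_wz += w * z
            s_wxz += w * xi * z
        # Solve the 2x2 weighted normal equations for (b0, b1).
        det = s_w * s_wxx - s_wx ** 2
        b0, b1 = ((s_wxx * s_wz - s_wx * s_wxz) / det,
                  (s_w * s_wxz - s_wx * s_wz) / det)
    return b0, b1

# Toy data: success proportion 0.25 when x = 0 and 0.75 when x = 1.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [0, 0, 0, 1, 0, 1, 1, 1]
b0, b1 = irls_logistic(x, y)
```

With a single binary covariate the maximum likelihood fit reproduces the two group proportions, so `b0` should approach logit(0.25) and `b0 + b1` should approach logit(0.75).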

2.
The generalized negative binomial distribution has been found useful in fitting over-dispersed as well as under-dispersed count data. We define and study the generalized binomial regression model which is used to predict a count response variable affected by one or more explanatory variables. The methods of maximum likelihood and moments are given for estimating the model parameters. Approximate tests for the adequacy of the model are considered. The generalized binomial regression model has been applied to two observed data sets to which the binomial regression model was applied earlier.

3.
We propose a generalization of the varying coefficient model for longitudinal data to cases where not only current but also recent past values of the predictor process affect current response. More precisely, the targeted regression coefficient functions of the proposed model have sliding window supports around current time t. A variant of a recently proposed two-step estimation method for varying coefficient models is proposed for estimation in the context of these generalized varying coefficient models, and is found to lead to improvements, especially for the case of additive measurement errors in both response and predictors. The proposed methodology for estimation and inference is also applicable for the case of additive measurement error in the common versions of varying coefficient models that relate only current observations of predictor and response processes to each other. Asymptotic distributions of the proposed estimators are derived, and the model is applied to the problem of predicting protein concentrations in a longitudinal study. Simulation studies demonstrate the efficacy of the proposed estimation procedure.

4.
We prove that the generalized Poisson distribution GP(θ, η) (η ≥ 0) is a mixture of Poisson distributions; this is a new property for a distribution which is the topic of the book by Consul (1989). Because we find that the fits to count data of the generalized Poisson and negative binomial distributions are often similar, to understand their differences, we compare the probability mass functions and skewnesses of the generalized Poisson and negative binomial distributions with the first two moments fixed. They have slight differences in many situations, but their zero-inflated distributions, with masses at zero, means and variances fixed, can differ more. These probabilistic comparisons are helpful in selecting a better fitting distribution for modelling count data with long right tails. Through a real example of count data with large zero fraction, we illustrate how the generalized Poisson and negative binomial distributions as well as their zero-inflated distributions can be discriminated.
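
For readers unfamiliar with the distribution named above, here is a minimal sketch of a generalized Poisson probability mass function. It assumes the commonly used parameterization P(X = x) = θ(θ + ηx)^(x−1) e^(−θ−ηx) / x! with θ > 0 and 0 ≤ η < 1; the abstract does not state the formula, so the parameterization and the function name `generalized_poisson_pmf` are assumptions.

```python
import math

def generalized_poisson_pmf(x, theta, eta):
    """Generalized Poisson pmf under the assumed parameterization
    P(X = x) = theta * (theta + eta*x)**(x-1) * exp(-theta - eta*x) / x!
    with theta > 0 and 0 <= eta < 1 (hypothetical helper).
    """
    if x < 0:
        return 0.0
    # Work on the log scale so large x does not overflow.
    log_p = (math.log(theta)
             + (x - 1) * math.log(theta + eta * x)
             - theta - eta * x
             - math.lgamma(x + 1))
    return math.exp(log_p)

# Numerical check of total mass and mean for theta = 2, eta = 0.3.
probs = [generalized_poisson_pmf(k, 2.0, 0.3) for k in range(200)]
total = sum(probs)
mean = sum(k * p for k, p in enumerate(probs))
```

Setting `eta = 0` recovers the ordinary Poisson distribution, and under this parameterization the mean is θ / (1 − η), which exceeds θ whenever η > 0.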

5.
Carrasco JL  Jover L 《Biometrics》2003,59(4):849-858
The intraclass correlation coefficient (ICC) and the concordance correlation coefficient (CCC) are two of the most popular measures of agreement for variables measured on a continuous scale. Here, we demonstrate that ICC and CCC are the same measure of agreement estimated in two ways: by the variance components procedure and by the moment method. We propose estimating the CCC using variance components of a mixed effects model, instead of the common method of moments. With the variance components approach, the CCC can easily be extended to more than two observers, and adjusted using confounding covariates, by incorporating them in the mixed model. A simulation study is carried out to compare the variance components approach with the moment method. The importance of adjusting by confounding covariates is illustrated with a case example.
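
The moment method mentioned above can be sketched for two observers as follows. This assumes the usual moment form of the CCC, 2·s_xy / (s_x² + s_y² + (x̄ − ȳ)²), with variances and covariance divided by n; the function name `concordance_correlation` is hypothetical, and the variance components estimator proposed in the paper is not shown.

```python
from statistics import fmean

def concordance_correlation(x, y):
    """Moment-method CCC for paired measurements from two observers.

    Hypothetical helper; uses 1/n (not 1/(n-1)) moments, which is an assumption.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # Disagreement in location (mx - my) lowers the CCC even if correlation is 1.
    return 2.0 * sxy / (sxx + syy + (mx - my) ** 2)

perfect = concordance_correlation([1, 2, 3], [1, 2, 3])
shifted = concordance_correlation([1, 2, 3], [2, 3, 4])
```

The second call shows why the CCC differs from the Pearson correlation: the two series are perfectly correlated, but the constant shift of one unit reduces agreement to 4/7.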

6.
Maps depicting cancer incidence rates have become useful tools in public health research, giving valuable information about the spatial variation in rates of disease. Typically, these maps are generated using count data aggregated over areas such as counties or census blocks. However, with the proliferation of geographic information systems and related databases, it is becoming easier to obtain exact spatial locations for the cancer cases and suitable control subjects. The use of such point data allows us to adjust for individual-level covariates, such as age and smoking status, when estimating the spatial variation in disease risk. Unfortunately, such covariate information is often subject to missingness. We propose a method for mapping cancer risk when covariates are not completely observed. We model these data using a logistic generalized additive model. Estimates of the linear and non-linear effects are obtained using a mixed effects model representation. We develop an EM algorithm to account for missing data and the random effects. Since the expectation step involves an intractable integral, we estimate the E-step with a Laplace approximation. This framework provides a general method for handling missing covariate values when fitting generalized additive models. We illustrate our method through an analysis of cancer incidence data from Cape Cod, Massachusetts. These analyses demonstrate that standard complete-case methods can yield biased estimates of the spatial variation of cancer risk.

7.
This article derives generalized prediction intervals for random effects in linear random‐effects models. For balanced and unbalanced data in two‐way layouts, models are considered with and without interaction. Coverage of the proposed generalized prediction intervals was estimated in a simulation study based on an agricultural field experiment. Generalized prediction intervals were compared with prediction intervals based on the restricted maximum likelihood (REML) procedure and the approximate methods of Satterthwaite and Kenward and Roger. The simulation study showed that coverage of generalized prediction intervals was closer to the nominal level 0.95 than coverage of prediction intervals based on the REML procedure.

8.
Abstract. Statistical models of the realized niche of species are increasingly used, but systematic comparisons of alternative methods are still limited. In particular, only a few studies have explored the effect of scale on model outputs. In this paper, we investigate the predictive ability of three statistical methods (generalized linear models, generalized additive models and classification tree analysis) using species distribution data at three scales: fine (Catalonia), intermediate (Portugal) and coarse (Europe). Four Mediterranean tree species were modelled for comparison. Variables selected by models were relatively consistent across scales and the predictive accuracy of models varied only slightly. However, there were slight differences in the performance of methods. Classification tree analysis had a lower accuracy than the generalized methods, especially at finer scales. The performance of generalized linear models also increased with scale. At the fine scale GLM with linear terms showed better accuracy than GLM with quadratic and polynomial terms. This is probably because distributions at finer scales represent a linear sub‐sample of entire realized niches of species. In contrast to GLM, the performance of GAM was constant across scales, being more data‐oriented. The predictive accuracy of GAM was always at least equal to other techniques, suggesting that this modelling approach is more robust to variations of scale because it can deal with any response shape.

9.
Chen Z  Liu J 《Biometrics》2009,65(2):470-477
Summary. Quantitative trait loci mapping in experimental organisms is of great scientific and economic importance. There has been a rapid advancement in statistical methods for quantitative trait loci mapping. Various methods for normally distributed traits have been well established. Some of them have also been adapted for other types of traits such as binary, count, and categorical traits. In this article, we consider a unified mixture generalized linear model (GLIM) for multiple interval mapping in experimental crosses. The multiple interval mapping approach was proposed by Kao, Zeng, and Teasdale (1999, Genetics 152, 1203–1216) for normally distributed traits. However, its application to nonnormally distributed traits has been hindered largely by the lack of an efficient computation algorithm and an appropriate mapping procedure. In this article, an effective expectation–maximization algorithm for the computation of the mixture GLIM and an epistasis-effect-adjusted multiple interval mapping procedure are developed. A real data set, Radiata Pine data, is analyzed and the data structure is used in simulation studies to demonstrate the desirable features of the developed method.

10.
Many distributions have been used in flood frequency analysis (FFA) for fitting the flood extremes data. However, as shown in the paper, the scatter of Polish data plotted on the moment ratio diagram shows that there is still room for a new model. In the paper, we study the usefulness of the generalized exponential (GE) distribution in flood frequency analysis for Polish rivers. We investigate the fit of the GE distribution to the Polish data of the maximum flows in comparison with the inverse Gaussian (IG) distribution, which in our previous studies showed the best fit among several models commonly used in FFA. Since the use of a discrimination procedure without knowledge of its performance for the considered probability density functions may lead to erroneous conclusions, we compare the probability of correct selection for the GE and IG distributions along with the analysis of the asymptotic model error with respect to the upper quantile values. As an application, both GE and IG distributions are alternatively assumed for describing the annual peak flows for several gauging stations of Polish rivers. To find the best fitting model, four discrimination procedures are used. In turn, they are based on the maximized logarithm of the likelihood function (K procedure), on the density function of the scale transformation maximal invariant (QK procedure), on the Kolmogorov-Smirnov statistics (KS procedure) and, for the fourth procedure, on the differences between the ML estimate of the 1% quantile and its value assessed by the method of moments and linear moments, in sequence (R procedure). Due to the uncertainty of choosing the best model, the method of aggregation is applied to estimate the maximum flow quantiles.

11.
Yau KK 《Biometrics》2001,57(1):96-102
A method for modeling survival data with multilevel clustering is described. The Cox partial likelihood is incorporated into the generalized linear mixed model (GLMM) methodology. Parameter estimation is achieved by maximizing a log likelihood analogous to the likelihood associated with the best linear unbiased prediction (BLUP) at the initial step of estimation and is extended to obtain residual maximum likelihood (REML) estimators of the variance component. Estimating equations for a three-level hierarchical survival model are developed in detail, and such a model is applied to analyze a set of chronic granulomatous disease (CGD) data on recurrent infections as an illustration with both hospital and patient effects being considered as random. Only the latter gives a significant contribution. A simulation study is carried out to evaluate the performance of the REML estimators. Further extension of the estimation procedure to models with an arbitrary number of levels is also discussed.

12.
By using deviance standardized residuals, the seemingly unrelated regression estimation procedure is extended to generalized linear models, and fitted by an iterative procedure. The matrix of cross products of standardized residuals is asymptotically multivariate normal, and can be used for further multivariate analyses and for hypothesis testing.

13.
A popular way to represent clustered binary, count, or other data is via the generalized linear mixed model framework, which accommodates correlation through incorporation of random effects. A standard assumption is that the random effects follow a parametric family such as the normal distribution; however, this may be unrealistic or too restrictive to represent the data. We relax this assumption and require only that the distribution of random effects belong to a class of 'smooth' densities and approximate the density by the seminonparametric (SNP) approach of Gallant and Nychka (1987). This representation allows the density to be skewed, multi-modal, fat- or thin-tailed relative to the normal and includes the normal as a special case. Because an efficient algorithm to sample from an SNP density is available, we propose a Monte Carlo EM algorithm using a rejection sampling scheme to estimate the fixed parameters of the linear predictor, variance components and the SNP density. The approach is illustrated by application to a data set and via simulation.

14.
We developed a generalized linear model of QTL mapping for discrete traits in line crossing experiments. Parameter estimation was achieved using two different algorithms, a mixture model-based EM (expectation–maximization) algorithm and a GEE (generalized estimating equation) algorithm under a heterogeneous residual variance model. The methods were developed using ordinal data, binary data, binomial data and Poisson data as examples. Applications of the methods to simulated as well as real data are presented. The two different algorithms were compared in the data analyses. In most situations, the two algorithms were indistinguishable, but when large QTL are located in large marker intervals, the mixture model-based EM algorithm can fail to converge to the correct solutions. Both algorithms were coded in C++ and interfaced with SAS as a user-defined SAS procedure called PROC QTL.

15.
Abstract. Generalized additive models (GAMs) are a non-parametric extension of generalized linear models (GLMs). They are introduced here as an exploratory tool in the analysis of species distributions with respect to climate. An important result is that the long-debated question of whether a response curve, in one dimension, is actually symmetric and bell-shaped or not, can be tested using GAMs. GAMs and GLMs are discussed and are illustrated by three examples using binary data. A grey-scale plot of one of the fits is constructed to indicate which areas on a map seem climatically suitable for that species. This is useful for species introductions. Further applications are mentioned.

16.
Generalized causal mediation analysis   (cited 1 time: 0 self-citations, 1 by others)
Albert JM  Nelson S 《Biometrics》2011,67(3):1028-1038
The goal of mediation analysis is to assess direct and indirect effects of a treatment or exposure on an outcome. More generally, we may be interested in the context of a causal model as characterized by a directed acyclic graph (DAG), where mediation via a specific path from exposure to outcome may involve an arbitrary number of links (or "stages"). Methods for estimating mediation (or pathway) effects are available for a continuous outcome and a continuous mediator related via a linear model, while for a categorical outcome or categorical mediator, methods are usually limited to two-stage mediation. We present a method applicable to multiple stages of mediation and mixed variable types using generalized linear models. We define pathway effects using a potential outcomes framework and present a general formula that provides the effect of exposure through any specified pathway. Some pathway effects are nonidentifiable and their estimation requires an assumption regarding the correlation between counterfactuals. We provide a sensitivity analysis to assess the impact of this assumption. Confidence intervals for pathway effect estimates are obtained via a bootstrap method. The method is applied to a cohort study of dental caries in very low birth weight adolescents. A simulation study demonstrates low bias of pathway effect estimators and close-to-nominal coverage rates of confidence intervals. We also find low sensitivity to the counterfactual correlation in most scenarios.

17.
This paper is concerned with the analysis of count data with special reference to experimental biology and agricultural research. The model considered in this paper is obtained by extending a generalized linear model by introducing random effects with associated variance components on the scale of the linear predictor. Maximum likelihood estimation is discussed and compared with a method which uses a simplified version of the likelihood equations. Two practical applications are used to illustrate the methods.

18.
On occasion, generalized linear models for counts based on Poisson or overdispersed count distributions may encounter lack of fit due to disproportionately large frequencies of zeros. Three alternative types of regression models that utilize all the information and explicitly account for excess zeros are examined and given general formulations. A simple mechanism for added zeros is assumed that directly motivates one type of model, here called the added-zero type, particular forms of which have been proposed independently by D. Lambert (1992) and in unpublished work by the author. An original regression formulation (the zero-altered model) is presented as a reduced form of the two-part model for count data, which is also discussed. It is suggested that two-part models be used to aid in development of an added-zero model when the latter is thought to be appropriate.
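
One familiar instance of a model with added zeros is the zero-inflated Poisson distribution, sketched below without covariates. It assumes the standard mixture form, in which an observation is a structural zero with probability `pi` and otherwise Poisson with mean `lam`; the abstract's general formulations may differ, and the function name `zip_pmf` is hypothetical.

```python
import math

def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson pmf (assumed standard mixture form).

    With probability pi the count is a structural zero; otherwise it is
    Poisson(lam). Requires lam > 0 and 0 <= pi <= 1.
    """
    if k < 0:
        return 0.0
    poisson = math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))
    if k == 0:
        return pi + (1.0 - pi) * poisson   # extra mass at zero
    return (1.0 - pi) * poisson

# Numerical check with lam = 2 and 30% added zeros.
probs = [zip_pmf(k, 2.0, 0.3) for k in range(100)]
total = sum(probs)
mean = sum(k * p for k, p in enumerate(probs))
```

The added zeros shrink the mean to (1 − pi)·lam while raising the zero frequency above what a plain Poisson with that mean would give, which is the lack of fit the abstract describes.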

19.
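Application of generalized ridge regression to the estimation of poultry breeding values   (cited 4 times: 1 self-citation, 3 by others)
The application of ridge regression to the mixed linear model equations for estimating poultry breeding values is discussed. In essence, the traditional mixed linear model equations are interpreted as a generalized ridge regression estimator, which provides a route to determining the estimates of genetic parameters. Taking Muscovy ducks as an example, with one trait and two fixed effects considered, the breeding values of male Muscovy ducks were estimated by generalized ridge regression and compared with those from best linear unbiased prediction (BLUP). The breeding values and their rankings estimated by generalized ridge regression and by BLUP were very close, with a correlation coefficient of 0.998** and a rank correlation coefficient of 0.986**, and the prediction error rate of the generalized ridge regression method was low (within ±10%). This indicates that estimating animal breeding values by generalized ridge regression in the mixed linear model equations is feasible and removes the need to estimate genetic parameters, making BLUP more practical in animal selection and breeding.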

20.

Introduction

With the renewed drive towards malaria elimination, there is a need for improved surveillance tools. While time series analysis is an important tool for surveillance, prediction and for measuring interventions’ impact, approximations by commonly used Gaussian methods are prone to inaccuracies when case counts are low. Therefore, statistical methods appropriate for count data are required, especially during “consolidation” and “pre-elimination” phases.

Methods

Generalized autoregressive moving average (GARMA) models were extended to generalized seasonal autoregressive integrated moving average (GSARIMA) models for parsimonious observation-driven modelling of non-Gaussian, non-stationary and/or seasonal time series of count data. The models were applied to monthly malaria case time series in a district in Sri Lanka, where malaria has decreased dramatically in recent years.

Results

The malaria series showed long-term changes in the mean, unstable variance and seasonality. After fitting negative-binomial Bayesian models, both a GSARIMA and a GARIMA deterministic seasonality model were selected based on different criteria. Posterior predictive distributions indicated that negative-binomial models provided better predictions than Gaussian models, especially when counts were low. The G(S)ARIMA models were able to capture the autocorrelation in the series.

Conclusions

G(S)ARIMA models may be particularly useful in the drive towards malaria elimination, since episode count series are often seasonal and non-stationary, especially when control is increased. Although building and fitting GSARIMA models is laborious, they may provide more realistic prediction distributions than do Gaussian methods and may be more suitable when counts are low.
