Similar literature
20 similar documents found
1.
Hall DB 《Biometrics》2000,56(4):1030-1039
In a 1992 Technometrics paper, Lambert (1992, 34, 1-14) described zero-inflated Poisson (ZIP) regression, a class of models for count data with excess zeros. In a ZIP model, a count response variable is assumed to be distributed as a mixture of a Poisson(lambda) distribution and a distribution with point mass of one at zero, with mixing probability p. Both p and lambda are allowed to depend on covariates through canonical link generalized linear models. In this paper, we adapt Lambert's methodology to an upper bounded count situation, thereby obtaining a zero-inflated binomial (ZIB) model. In addition, we add to the flexibility of these fixed effects models by incorporating random effects so that, e.g., the within-subject correlation and between-subject heterogeneity typical of repeated measures data can be accommodated. We motivate, develop, and illustrate the methods described here with an example from horticulture, where both upper bounded count (binomial-type) and unbounded count (Poisson-type) data with excess zeros were collected in a repeated measures designed experiment.
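The mixture described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that p is the weight on the point mass at zero, and the function names and the numeric values of the linear predictors are hypothetical. The logit link for p and the log link for lambda are the canonical links for the binomial and Poisson families.

```python
import math

def zip_pmf(k, lam, p):
    """P(Y = k) for a zero-inflated Poisson.

    Assumption: p is the probability of the point mass at zero,
    and (1 - p) is the weight on the Poisson(lam) component.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    return p * (k == 0) + (1.0 - p) * poisson

def inverse_logit(eta):
    """Map a linear predictor to a probability (inverse of the logit link)."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical linear predictors for a single observation.
p = inverse_logit(-1.0)  # logit link for the mixing probability
lam = math.exp(0.5)      # log link for the Poisson mean

# The probabilities over a long enough range of counts sum to (nearly) one.
probs = [zip_pmf(k, lam, p) for k in range(50)]
```

With p > 0, the probability of a zero count exceeds the plain Poisson value exp(-lambda), which is the "excess zeros" feature the abstract refers to.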

2.
3.
Toledano AY  Gatsonis C 《Biometrics》1999,55(2):488-496
We propose methods for regression analysis of repeatedly measured ordinal categorical data when there is nonmonotone missingness in these responses and when a key covariate is missing depending on observables. The methods use ordinal regression models in conjunction with generalized estimating equations (GEEs). We extend the GEE methodology to accommodate arbitrary patterns of missingness in the responses when this missingness is independent of the unobserved responses. We further extend the methodology to provide correction for possible bias when missingness in knowledge of a key covariate may depend on observables. The approach is illustrated with the analysis of data from a study in diagnostic oncology in which multiple correlated receiver operating characteristic curves are estimated and corrected for possible verification bias when the true disease status is missing depending on observables.

4.
An estimation method for the semiparametric mixed effects model   Cited by: 6 (self-citations: 0; citations by others: 6)
Tao H  Palta M  Yandell BS  Newton MA 《Biometrics》1999,55(1):102-110
A semiparametric mixed effects regression model is proposed for the analysis of clustered or longitudinal data with continuous, ordinal, or binary outcome. The common assumption of Gaussian random effects is relaxed by using a predictive recursion method (Newton and Zhang, 1999) to provide a nonparametric smooth density estimate. A new strategy is introduced to accelerate the algorithm. Parameter estimates are obtained by maximizing the marginal profile likelihood by Powell's conjugate direction search method. Monte Carlo results are presented to show that the method can improve the mean squared error of the fixed effects estimators when the random effects distribution is not Gaussian. The usefulness of visualizing the random effects density itself is illustrated in the analysis of data from the Wisconsin Sleep Survey. The proposed estimation procedure is computationally feasible for quite large data sets.

5.
6.
7.
Summary. A common and important problem in clustered sampling designs is that the effect of within-cluster exposures (i.e., exposures that vary within clusters) on outcome may be confounded by both measured and unmeasured cluster-level factors (i.e., measurements that do not vary within clusters). When some of these are poorly accounted for or not accounted for at all, estimation of this effect through population-averaged models or random-effects models may introduce bias. We accommodate this by developing a general theory for the analysis of clustered data, which enables consistent and asymptotically normal estimation of the effects of within-cluster exposures in the presence of cluster-level confounders. Semiparametric efficient estimators are obtained by solving so-called conditional generalized estimating equations. We compare this approach with a popular proposal by Neuhaus and Kalbfleisch (1998, Biometrics 54, 638–645) who separate the exposure effect into a within- and a between-cluster component within a random intercept model. We find that the latter approach yields consistent and efficient estimators when the model is linear, but is less flexible in terms of model specification. Under nonlinear models, this approach may yield inconsistent and inefficient estimators, though with little bias in most practical settings.

8.
Zhu  Zhongyi; Fung  Wing K.; He  Xuming 《Biometrika》2008,95(4):907-917
There have been studies on how the asymptotic efficiency of a nonparametric function estimator depends on the handling of the within-cluster correlation when nonparametric regression models are used on longitudinal or cluster data. In particular, methods based on smoothing splines and local polynomial kernels exhibit different behaviour. We show that the generalized estimation equations based on weighted least squares regression splines for the nonparametric function have an interesting property: the asymptotic bias of the estimator does not depend on the working correlation matrix, but the asymptotic variance, and therefore the mean squared error, is minimized when the true correlation structure is specified. This property of the asymptotic bias distinguishes regression splines from smoothing splines.

9.
Summary. Cluster randomized trials in health care may involve three instead of two levels, for instance, in trials where different interventions to improve quality of care are compared. In such trials, the intervention is implemented in health care units (“clusters”) and aims at changing the behavior of health care professionals working in this unit (“subjects”), while the effects are measured at the patient level (“evaluations”). Within the generalized estimating equations approach, we derive a sample size formula that accounts for two levels of clustering: that of subjects within clusters and that of evaluations within subjects. The formula reveals that sample size is inflated, relative to a design with completely independent evaluations, by a multiplicative term that can be expressed as a product of two variance inflation factors, one that quantifies the impact of within‐subject correlation of evaluations on the variance of subject‐level means and the other that quantifies the impact of the correlation between subject‐level means on the variance of the cluster means. Power levels as predicted by the sample size formula agreed well with the simulated power for more than 10 clusters in total, when data were analyzed using bias‐corrected estimating equations for the correlation parameters in combination with the model‐based covariance estimator or the sandwich estimator with a finite sample correction.

10.
Summary. Motivated by the spatial modeling of aberrant crypt foci (ACF) in colon carcinogenesis, we consider binary data with probabilities modeled as the sum of a nonparametric mean plus a latent Gaussian spatial process that accounts for short-range dependencies. The mean is modeled in a general way using regression splines. The mean function can be viewed as a fixed effect and is estimated with a penalty for regularization. With the latent process viewed as another random effect, the model becomes a generalized linear mixed model. In our motivating data set and other applications, the sample size is too large to easily accommodate maximum likelihood or restricted maximum likelihood estimation (REML), so pairwise likelihood, a special case of composite likelihood, is used instead. We develop an asymptotic theory for models that are sufficiently general to be used in a wide variety of applications, including, but not limited to, the problem that motivated this work. The splines have penalty parameters that must converge to zero asymptotically: we derive theory for this along with a data-driven method for selecting the penalty parameter, a method that is shown in simulations to improve greatly upon standard devices, such as likelihood crossvalidation. Finally, we apply the methods to the data from our ACF experiment. We discover an unexpected location for peak formation of ACF.

11.
Abstract. Generalized additive, generalized linear, and classification tree models were developed to predict the distribution of 20 species of chaparral and coastal sage shrubs within the southwest ecoregion of California. Mapped explanatory variables included bioclimatic attributes related to primary environmental regimes: averages of annual precipitation, minimum temperature of the coldest month, maximum temperature of the warmest month, and topographically-distributed potential solar insolation of the wettest quarter (winter) and of the growing season (spring). Also tested for significance were slope angle (related to soil depth) and the geographic coordinates of each observation. Models were parameterized and evaluated based on species presence/absence data from 906 plots surveyed on National Forest lands. Although all variables were significant in at least one of the species’ models, those models based only on the bioclimatic variables predicted species presence with 3–26% error. While error would undoubtedly be greater if the models were evaluated using independent data, results indicate that these models are useful for predictive mapping – for interpolating species distribution data within the ecoregion. All three methods produced models with similar accuracy for a given species; GAMs were useful for exploring the shape of the response functions, GLMs allowed those response functions to be parameterized and their significance tested, and classification trees, while sometimes difficult to interpret, yielded the lowest prediction errors (lower by 3–5%).

12.
13.
14.
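Modelling branch biomass of Chinese fir plantations using linear mixed effects models   Cited by: 2 (self-citations: 0; citations by others: 2)
Based on 572 sets of branch biomass data from 45 stem-analysis trees of planted Chinese fir at the Jiangle Forest Farm in Fujian Province, linear mixed effects models were used to build prediction models for total branch biomass, branch wood biomass, and foliage biomass in Chinese fir plantations, and the models were validated with independent sample data. The results showed that the linear mixed effects models fitted more accurately than traditional multiple linear regression models. Different combinations of random-effect parameters gave mixed models of differing accuracy. Mixed models that incorporated a heteroscedasticity structure removed the heteroscedasticity in the data and were more accurate: for the total branch biomass and foliage biomass models, accuracy was highest with an exponential function as the heteroscedasticity structure; for the branch wood biomass model, accuracy was highest with a constant-plus-power function. Model validation showed that, for predicting branch biomass in Chinese fir plantations, linear mixed models that account for random effects and heteroscedasticity were markedly more accurate than traditional multiple linear regression models.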

15.
16.
Species’ distribution models are widely used in landscape ecology but usually lack explicit information about species’ responses to ecosystem dynamics, leading to uncertainty when applied to the prediction of seasonal change in distributions. In this study, we aimed to build a species’ distribution model for the Common Quail Coturnix coturnix, a farmland species that shows changes in its distribution in response to seasonal changes in habitat suitability. During the course of three breeding seasons we collected temporal replicates of presence–absence data in 13 sampling locations in four countries (Morocco, Portugal, Spain and France). We used generalized linear mixed models to relate the species’ presence or absence to environmental variables and to the normalized difference vegetation index at each sampling location through the seasons, the latter variable being an indicator of within‐ and between‐season habitat changes. The preferred model showed that occurrence was highly dependent on habitat changes associated with crop seasonality, as measured by the normalized difference vegetation index. Common Quail selected areas with dense vegetation and warm climate and tracked spatial changes in these two parameters. The model allows accurate mapping of within‐ and between‐season distribution changes. Such changes are related to habitat variations caused mainly by drought and agricultural practices. Our results demonstrate that seasonal changes in farmland ecosystems can be incorporated into a simple distribution model, and our approach could be applied to other species to predict the effects of agricultural changes on the distribution of birds inhabiting farmland landscapes.

17.
18.
Users of analysis of variance (ANOVA) procedures are accustomed to an ANOVA table, followed by a table of means. When the underlying linear model is variance‐balanced, i.e. the standard error of a difference is constant for all pairwise comparisons, non‐significant differences can be indicated by underlining. Unfortunately, when the design is unbalanced, it may turn out to be impossible to consistently represent all significant differences by standard underlining procedures. This paper proposes simple approaches, which allow a “connected lines” representation of treatment comparisons in the unbalanced case. The price for the improved display of results is a potential need to set aside some significances and report them separately. Experience shows that often all significances can be displayed by underlining, especially when variance‐imbalance is moderate. Alternatively, a “broken lines” representation can be used, which is guaranteed to allow a display of all significances. This type of display seems particularly suitable for implementation as a letters representation into statistical packages for linear models.

19.
Abstract. Quantitative response surfaces obtained from three performance measures, density, cover and volume, are compared, using as an example the spatial distribution of Periploca angustifolia (Asclepiadaceae) in SE Spain. Generalized linear models are used to examine relationships between these species performance measures and complex gradients of aspect, slope angle and altitude. All three performance measures showed a skewed response to the environmental gradients, unlike the Gaussian responses commonly assumed in vegetation theory; skewness increased as the number of dimensions of the performance measures increased. Certain asymmetries between the responses are discussed in terms of competition, and problems related to the use of complex gradients are considered.

20.
The outgrowth of tiller buds in Poaceae is influenced by the ratio of red to far-red light irradiance (R:FR). At each point in the plant canopy, R:FR is affected by light scattered by surrounding plant tissues. This paper presents a three-dimensional virtual plant modelling approach to simulate local effects of R:FR on tillering in spring wheat (Triticum aestivum). R:FR dependence of bud outgrowth was implemented in a wheat model, using three hypothetical responses of bud extension to R:FR (unit step, curvilinear and linear response). Bud break occurred when a threshold bud length was reached. Simulations were performed for three plant population densities. In accordance with experimental observations, fewer tillers per plant were simulated for higher plant population densities. The linear and curvilinear responses caused a delay in the increase in tiller number compared with experimental data. The unit step response approached experimental results best. It is suggested that a model based on relatively simple relations can be used to simulate degree of tillering. This study has shown that the virtual plant approach is a promising tool with which to address crop morphological and ecological research questions where the determining factors act at the level of the individual plant organ.
