Similar literature
1.
In the analysis of repeated measurements, multivariate regression models that account for the correlations among observations from the same subject are widely used. Like the usual univariate regression models, these multivariate regression models need model diagnostic procedures. Although the models are widely used, few studies have addressed their diagnostics. In this paper, we propose simple residual plots to investigate the goodness of model fit for repeated measures data. Here, we mainly focus on diagnostics for the mean model. The proposed residual plots are based on the quantile-quantile (Q-Q) plots of a χ² distribution and a normal distribution. In particular, the proposed plots are useful for comparing several models simultaneously. The proposed method is illustrated using two examples. (© 2004 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)

2.
Evaluating the goodness of fit of logistic regression models is crucial to ensure the accuracy of the estimated probabilities. Unfortunately, such evaluation is problematic in large samples. Because the power of traditional goodness of fit tests increases with the sample size, practically irrelevant discrepancies between estimated and true probabilities are increasingly likely to cause the rejection of the hypothesis of perfect fit in larger and larger samples. This phenomenon has been widely documented for popular goodness of fit tests, such as the Hosmer-Lemeshow test. To address this limitation, we propose a modification of the Hosmer-Lemeshow approach. By standardizing the noncentrality parameter that characterizes the alternative distribution of the Hosmer-Lemeshow statistic, we introduce a parameter that measures the goodness of fit of a model but does not depend on the sample size. We provide the methodology to estimate this parameter and construct confidence intervals for it. Finally, we propose a formal statistical test to rigorously assess whether the fit of a model, albeit not perfect, is acceptable for practical purposes. The proposed method is compared in a simulation study with a competing modification of the Hosmer-Lemeshow test, based on repeated subsampling. We provide a step-by-step illustration of our method using a model for postneonatal mortality developed in a large cohort of more than 300 000 observations.
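
The sample-size dependence this abstract describes can be illustrated with the standard Hosmer-Lemeshow statistic. The following is a minimal sketch only, not the authors' standardized parameter: the function name `hosmer_lemeshow`, the equal-size grouping rule, and the toy data are assumptions made for illustration.

```python
def hosmer_lemeshow(y, p, groups=10):
    """Hosmer-Lemeshow chi-square statistic.

    y: observed 0/1 outcomes; p: fitted probabilities.
    Observations are sorted by fitted probability and cut into
    `groups` chunks of (nearly) equal size.
    """
    pairs = sorted(zip(p, y))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        n_g = len(chunk)
        expected = sum(p_i for p_i, _ in chunk)   # expected number of events
        observed = sum(y_i for _, y_i in chunk)   # observed number of events
        mean_p = expected / n_g
        denom = n_g * mean_p * (1.0 - mean_p)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat


# Hypothetical, deliberately miscalibrated toy data: no events at all.
p = [0.25] * 4 + [0.75] * 4
y = [0] * 8

small = hosmer_lemeshow(y, p, groups=2)
# Same miscalibration, twice the sample size.
large = hosmer_lemeshow(y * 2, p * 2, groups=2)
print(small, large)
```

In this toy case, doubling the sample while keeping the miscalibration fixed doubles the statistic, which is the behaviour that motivates a measure of fit that does not depend on the sample size.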

3.
This paper addresses testing the goodness of fit of models for marginal probabilities estimated by generalized estimating equations. We develop a modified version of the generalized estimating equation and a goodness-of-fit test based on the fitted marginal means. The test statistic is easy to compute and has a simple reference distribution. Its performance is evaluated asymptotically and in small samples. It is also compared to the deviance and Pearson X² statistics. Example applications are given. (© 2004 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)

4.
The well-known χ² goodness of fit test for a multinomial distribution is generally biased when observations are subject to misclassification. In this paper, based on a double sampling scheme, the family of φ-divergence test statistics is introduced for testing goodness of fit under misclassification of the data. The case of binomial data is discussed and an illustrative example is also given.

5.
Chang Xuan Mao, Jun Li. Biometrics 2009, 65(4): 1063-1067
Summary: Comparing species assemblages given incidence-based data is important in ecological studies and is often done by visual inspection of estimated species accumulation curves or by ad hoc use of 95% pointwise confidence bands of these curves. It is shown that comparing species assemblages is a challenging problem. A χ² test is proposed. An adjustment using an eigenvalue decomposition is proposed to overcome computational difficulties. The bootstrap method is also suggested to approximate the distribution of the proposed test statistic. The eigenvalue adjusted (Eva) χ² test and the Eva-bootstrap test are assessed by a simulation study. Both the Eva-χ² and the Eva-bootstrap tests are applied to a study that involves two woody seedling species assemblages.

6.
Model checking for ROC regression analysis   Cited by: 1 (self-citations: 0, citations by others: 1)
Cai T, Zheng Y. Biometrics 2007, 63(1): 152-163
Summary: The receiver operating characteristic (ROC) curve is a prominent tool for characterizing the accuracy of a continuous diagnostic test. To account for factors that might influence the test accuracy, various ROC regression methods have been proposed. However, as in any regression analysis, when the assumed models do not fit the data well, these methods may render invalid and misleading results. To date, practical model-checking techniques suitable for validating existing ROC regression models are not yet available. In this article, we develop cumulative residual-based procedures to graphically and numerically assess the goodness of fit for some commonly used ROC regression models, and show how specific components of these models can be examined within this framework. We derive asymptotic null distributions for the residual processes and discuss resampling procedures to approximate these distributions in practice. We illustrate our methods with a dataset from the cystic fibrosis registry.

7.
The coefficient of determination (R²) is a common measure of goodness of fit for linear models. Various proposals have been made for extension of this measure to generalized linear and mixed models. When the model has random effects or correlated residual effects, the observed responses are correlated. This paper proposes a new coefficient of determination for this setting that accounts for any such correlation. A key advantage of the proposed method is that it only requires the fit of the model under consideration, with no need to also fit a null model. Also, the approach entails a bias correction in the estimator assessing the variance explained by fixed effects. Three examples are used to illustrate the new measure. A simulation shows that the proposed estimator of the new coefficient of determination has only minimal bias.

8.
Species distribution models (SDMs) are important management tools for highly mobile marine species because they provide spatially and temporally explicit information on animal distribution. Two prevalent modeling frameworks used to develop SDMs for marine species are generalized additive models (GAMs) and boosted regression trees (BRTs), but comparative studies have rarely been conducted; most rely on presence-only data; and few have explored how features such as species distribution characteristics affect model performance. Since the majority of marine species BRTs have been used to predict habitat suitability, we first compared BRTs to GAMs that used presence/absence as the response variable. We then compared results from these habitat suitability models to GAMs that predict species density (animals per km²) because density models built with a subset of the data used here have previously received extensive validation. We compared both the explanatory power (i.e., model goodness of fit) and predictive power (i.e., performance on a novel dataset) of the GAMs and BRTs for a taxonomically diverse suite of cetacean species using a robust set of systematic survey data (1991–2014) within the California Current Ecosystem. Both BRTs and GAMs were successful at describing overall distribution patterns throughout the study area for the majority of species considered, but when predicting on novel data, the density GAMs exhibited substantially greater predictive power than both the presence/absence GAMs and BRTs, likely due to both the different response variables and fitting algorithms. Our results provide an improved understanding of some of the strengths and limitations of models developed using these two methods. These results can be used by modelers developing SDMs and resource managers tasked with the spatial management of marine species to determine the best modeling technique for their question of interest.

9.
We present a test of goodness of fit for the proportional hazard regression model. The test is based on a score statistic for testing against local mixture alternatives. Contrary to the findings of several other authors, we detect a significant lack of fit in Freireich's leukemia data.

10.
An efficient recursive polynomial multiplication method is proposed for exact unconditional power calculation for unordered 2 × K contingency tables with sample sizes up to moderate. Our method can be applied to the family of cell-additive statistics, which includes the Freeman-Halton statistic, the Pearson χ² statistic and the likelihood ratio statistic. We illustrate our proposed method by several numerical examples.

11.
Summary: The median failure time is often utilized to summarize survival data because it has a more straightforward interpretation for investigators in practice than the popular hazard function. However, existing methods for comparing median failure times for censored survival data either require estimation of the probability density function or involve complicated formulas to calculate the variance of the estimates. In this article, we modify a K-sample median test for censored survival data (Brookmeyer and Crowley, 1982, Journal of the American Statistical Association 77, 433–440) through a simple contingency table approach where each cell counts the number of observations in each sample that are greater than the pooled median or vice versa. Under censoring, this approach would generate noninteger entries for the cells in the contingency table. We propose to construct a weighted asymptotic test statistic that aggregates dependent χ²-statistics formed at the nearest integer points to the original noninteger entries. We show that this statistic follows approximately a χ²-distribution with k − 1 degrees of freedom. For a small sample case, we propose a test statistic based on combined p-values from Fisher's exact tests, which follows a χ²-distribution with 2 degrees of freedom. Simulation studies are performed to show that the proposed method provides reasonable type I error probabilities and powers. The proposed method is illustrated with two real datasets from phase III breast cancer clinical trials.
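
The contingency table idea in this abstract can be sketched for the simplest case. This is a minimal illustration that assumes complete (uncensored) data, so it does not reproduce the weighted statistic the authors propose for noninteger entries; the function names `median_table` and `pearson_chi2` and the toy samples are hypothetical.

```python
import statistics


def median_table(samples):
    """Build the 2 x K table of counts above / not above the pooled median."""
    pooled = statistics.median([v for s in samples for v in s])
    above = [sum(1 for v in s if v > pooled) for s in samples]
    not_above = [len(s) - a for s, a in zip(samples, above)]
    return pooled, above, not_above


def pearson_chi2(above, not_above):
    """Pearson chi-square statistic for a 2 x K table.

    Assumes every row total and column total is positive.
    """
    total = sum(above) + sum(not_above)
    stat = 0.0
    for row in (above, not_above):
        row_total = sum(row)
        for j, observed in enumerate(row):
            col_total = above[j] + not_above[j]
            expected = row_total * col_total / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical failure times for K = 2 samples, no censoring.
samples = [[1, 2, 3, 4], [5, 6, 7, 8]]
pooled, above, not_above = median_table(samples)
stat = pearson_chi2(above, not_above)
print(pooled, above, not_above, stat)
```

With K samples the statistic would be referred to a χ² distribution with K − 1 degrees of freedom; handling censoring is the part that requires the authors' method.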

12.
Di CZ, Liang KY. Biometrics 2011, 67(4): 1249-1259
Summary: We consider likelihood ratio tests (LRT) and their modifications for homogeneity in admixture models. The admixture model is a two-component mixture model, where one component is indexed by an unknown parameter while the parameter value for the other component is known. This model is widely used in genetic linkage analysis under heterogeneity in which the kernel distribution is binomial. For such models, it has long been recognized that testing for homogeneity is nonstandard, and the LRT statistic does not converge to a conventional χ² distribution. In this article, we investigate the asymptotic behavior of the LRT for general admixture models and show that its limiting distribution is equivalent to the supremum of a squared Gaussian process. We also discuss the connection and comparison between LRT and alternative approaches such as modifications of LRT and score tests, including the modified LRT (Fu, Chen, and Kalbfleisch, 2006, Statistica Sinica 16, 805–823). The LRT is an omnibus test that is powerful to detect general alternative hypotheses. In contrast, alternative approaches may be slightly more powerful to detect certain types of alternatives, but much less powerful for others. Our results are illustrated by simulation studies and an application to a genetic linkage study of schizophrenia.

13.
The problem of categorical data analysis in survey sampling arises because the elements of a sample obtained through an imposed sampling design are not independent. In this article, the performance of modified χ² statistics for testing independence of attributes is evaluated for small sample sizes with the help of log-linear models, with respect to the achieved level of significance for a fixed nominal level of 5%, through simulation. It has been observed that the performance of these test statistics depends on the average and the coefficient of variation of the eigenvalues of the design effect matrix. The first order corrected statistic is able to capture the effect of the sampling design to a great extent, but the performance of the second order corrected statistic is much better. Further, these modified χ² test statistics were applied to real survey data and their performance was evaluated with respect to their achieved level of significance.

14.
Epidemiological studies often include numerous covariates, with a variety of possible approaches to control for confounding of the association of primary interest, as well as a variety of possible models for the exposure–response association of interest. Walsh and Kaiser (Radiat Environ Biophys 50:21–35, 2011) advocate a weighted averaging of the models, where the weights are a function of overall model goodness of fit and degrees of freedom. They apply this method to analyses of radiation–leukemia mortality associations among Japanese A-bomb survivors. We caution against such an approach, noting that the proposed model averaging approach prioritizes the inclusion of covariates that are strong predictors of the outcome, but which may be irrelevant as confounders of the association of interest, and penalizes adjustment for covariates that are confounders of the association of interest, but may contribute little to overall model goodness of fit. We offer a simple illustration of how this approach can lead to biased results. The proposed model averaging approach may also be suboptimal as a way to handle competing model forms for an exposure–response association of interest, given adjustment for the same set of confounders; alternative approaches, such as hierarchical regression, may provide a more useful way to stabilize risk estimates in this setting.

15.
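Forest fire prediction and forecasting are prerequisites for scientific and effective forest fire management and an area of wide concern for forestry management agencies and researchers. Logistic regression (LR) is currently the modeling method most widely used for forest fire prediction in China and abroad; however, in recent years some researchers have found that it does not adequately account for the spatial correlation and heterogeneity of fire-influencing factors, which biases the model fit. The geographically weighted logistic regression (GWLR) model takes the spatial correlation among model variables into account and effectively improves the model's fitting ability. To explore the applicability of the GWLR model to forest fire prediction in Fujian, this study used both LR and GWLR to build models predicting forest fires in Fujian Province from meteorological factors, and judged the applicability of GWLR by comparing the fitting ability of the two models. Based on satellite fire-point data for forest fires and daily meteorological factors in Fujian from 2000 to 2005, the full sample was divided into 60% modeling data and 40% validation data; this was repeated five times to create five sample groups. Variables that were significant in three or more of the five sample groups were entered into the final model. The results show that GWLR outperformed the LR model in goodness of fit, model residuals, spatial autocorrelation, and prediction accuracy, indicating that fully accounting for the spatial heterogeneity of model variables helps improve predictive accuracy, and confirming the suitability of GWLR for forest fire prediction in Fujian. In addition, the model parameter results show that eight factors had a significant effect on fire occurrence in Fujian Province: "daily maximum surface temperature", "daily minimum surface temperature", "daily mean wind speed", "24-hour precipitation", "daily maximum station pressure", "sunshine duration", "daily maximum air temperature", and "daily minimum relative humidity". The findings provide a new method for forest fire prediction and forecasting in Fujian.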
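
The resampling design in this abstract (five random 60%/40% splits, with a variable retained when it is significant in at least three of the five groups) can be sketched as follows. This is an illustration of the selection rule only: the helper names `split_60_40` and `stable_variables`, the seeds, and the per-group significance lists are hypothetical, and no model is fitted here.

```python
import random
from collections import Counter


def split_60_40(records, seed):
    """Randomly split records into 60% modeling and 40% validation data."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = round(0.6 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def stable_variables(significant_per_group, min_groups=3):
    """Keep variables that are significant in at least `min_groups` groups."""
    counts = Counter(v for group in significant_per_group for v in set(group))
    return sorted(v for v, c in counts.items() if c >= min_groups)


# Five sample groups built from the same (hypothetical) 100 records.
records = list(range(100))
groups = [split_60_40(records, seed) for seed in range(5)]

# Hypothetical outcome of fitting a model in each group:
# the variables found significant there.
significant = [
    ["wind", "rain"],
    ["wind"],
    ["wind", "rain"],
    ["rain", "sunshine"],
    ["sunshine"],
]
selected = stable_variables(significant)
print(selected)
```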

16.
An estimate of the risk, adjusted for confounders, can be obtained from a fitted logistic regression model, but it substantially over-estimates when the outcome is not rare. The log binomial model, binomial errors and log link, is increasingly being used for this purpose. However this model's performance, goodness of fit tests and case-wise diagnostics have not been studied. Extensive simulations are used to compare the performance of the log binomial, a logistic regression based method proposed by Schouten et al. (1993) and a Poisson regression approach proposed by Zou (2004) and Carter, Lipsitz, and Tilley (2005). Log binomial regression resulted in "failure" rates (non-convergence, out-of-bounds predicted probabilities) as high as 59%. Estimates by the method of Schouten et al. (1993) produced fitted log binomial probabilities greater than unity in up to 19% of samples to which a log binomial model had been successfully fit and in up to 78% of samples when the log binomial model fit failed. Similar percentages were observed for the Poisson regression approach. Coefficient and standard error estimates from the three models were similar. Rejection rates for goodness of fit tests for log binomial fit were around 5%. Power of goodness of fit tests was modest when an incorrect logistic regression model was fit. Examples demonstrate the use of the methods. Uncritical use of the log binomial regression model is not recommended.
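
The over-estimation mentioned at the start of this abstract can be seen with simple arithmetic, assuming it refers to using the odds ratio from logistic regression as an approximation to the risk ratio. The sketch below uses hypothetical risks in an exposed and an unexposed group; the function names are illustrative.

```python
def risk_ratio(p_exposed, p_unexposed):
    """Ratio of the two risks (probabilities of the outcome)."""
    return p_exposed / p_unexposed


def odds_ratio(p_exposed, p_unexposed):
    """Ratio of the two odds, the quantity logistic regression estimates."""
    odds_exposed = p_exposed / (1.0 - p_exposed)
    odds_unexposed = p_unexposed / (1.0 - p_unexposed)
    return odds_exposed / odds_unexposed


# Hypothetical risks: a rare outcome and a common outcome,
# both with a true risk ratio of 2.
for p1, p0 in [(0.02, 0.01), (0.40, 0.20)]:
    print(p1, p0, risk_ratio(p1, p0), odds_ratio(p1, p0))
```

For the rare outcome the odds ratio stays close to the risk ratio of 2; for the common outcome it is noticeably larger, which is why models that target the risk ratio directly, such as the log binomial model, are of interest.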

17.
A rapid method for predicting the buckwheat flour ratio of dried buckwheat noodles was developed by using the fluorescence fingerprint and partial least squares regression. Fitting the calibration model to validation datasets showed R² = 0.78 and SEP = 12.4%. The model was refined for a better fit by deleting several samples containing additional ingredients. The best fit was finally obtained (R² = 0.84 and SEP = 10.4%) by deleting the samples containing vinegar, green tea, seaweed, polysaccharide thickener, and yam. This result demonstrates that a calibration model with high accuracy could be constructed based on samples similar in material composition. The developed methodology requires no complex preprocessing, enables rapid measurement with a small sample amount, and would thus be suitable for practical application to the food industry.

18.
Sexual size dimorphism (SSD) is widespread in animals, especially in lizards (Reptilia: Squamata), and is driven by fecundity selection, male–male competition, or other adaptive hypotheses. However, these selective pressures may vary through different life history periods; thus, it is essential to assess the relationship between growth and SSD. In this study, we tracked SSD dynamics between a "fading-tail color skink" (blue tail skink whose tail is only blue during its juvenile stage: Plestiodon elegans) and a "nonfade color" tail skink (retains a blue tail throughout life: Plestiodon quadrilineatus) under a controlled experimental environment. We fitted growth curves of morphological traits (body mass, SVL, and TL) using three growth models (Logistic, Gompertz, and von Bertalanffy). We found that both skinks have male-biased SSD as adults. Body mass has a higher goodness of fit (as represented by very high R² values) using the von Bertalanffy model than the other two models. In contrast, SVL and TL for both skinks had higher goodness of fit when using the Gompertz model. The two lizards displayed divergent life history tactics: P. elegans grows faster, matures earlier (at 65 weeks), and presents an allometric growth rate, whereas P. quadrilineatus grows slower, matures later (at 106 weeks), and presents an isometric growth rate. Our findings imply that species- and sex-specific trade-offs in the allocation of energy to growth and reproduction may cause the growth patterns to diverge, ultimately resulting in the dissimilar patterns of SSD.
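
For readers unfamiliar with the three growth models named in this abstract, the sketch below writes each one in a common textbook form. The parameterization is an assumption: the abstract does not say which form the authors fitted, and the symbols `asymptote`, `k`, and `t0` as well as the numeric values are illustrative.

```python
import math


def logistic(t, asymptote, k, t0):
    """Logistic growth: symmetric S-curve with inflection at t0."""
    return asymptote / (1.0 + math.exp(-k * (t - t0)))


def gompertz(t, asymptote, k, t0):
    """Gompertz growth: asymmetric S-curve with inflection at t0."""
    return asymptote * math.exp(-math.exp(-k * (t - t0)))


def von_bertalanffy(t, asymptote, k, t0):
    """von Bertalanffy growth: decelerating curve, size 0 at t0."""
    return asymptote * (1.0 - math.exp(-k * (t - t0)))


# Hypothetical parameters: asymptotic size 100, rate 0.3, t0 = 10 weeks.
for week in (10, 20, 40, 80):
    print(
        week,
        round(logistic(week, 100.0, 0.3, 10.0), 2),
        round(gompertz(week, 100.0, 0.3, 10.0), 2),
        round(von_bertalanffy(week, 100.0, 0.3, 10.0), 2),
    )
```

All three curves approach the same asymptote but differ in shape early on, which is why goodness of fit can favor different models for different traits.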

19.
The mathematical basis of a widely known variance-mean power relationship of ecological populations was examined. It is shown that the plot of log variance (log S²) against log mean (log m) is virtually delimited by two lines, log S² = log n + 2 log m and log S² = log m, thus increasing the chance that a linear regression line can be successfully fitted, without a profoundly behavioural background. This makes it difficult to interpret a successful fit of the power law regression and its parameter b in a biologically meaningful manner. In comparison with the power law regression, Iwao's m*-m regression is structurally less constrained, i.e. has a wider spatial region in which data points can scatter. This suggests that a comparison between the two methods in terms of how good a fit is achieved for a particular data set is largely meaningless, since the power law regression may inherently produce a better fit due to its constrained spatial entity. Furthermore, it could be argued that a successful fit in Iwao's method, when found, is less taxed with mathematical artefacts and perhaps more clearly linked to some biological mechanisms underlying spatial dispersion of populations.
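
The upper delimiting line in this abstract can be checked numerically. The sketch below assumes that n is the number of sampling units and that S² is the usual sample variance with divisor n − 1; under those assumptions the most clumped arrangement possible, with every individual in a single unit, gives S² = n·m² exactly. The function name and the numbers are illustrative.

```python
import statistics


def most_clumped(n_units, total_individuals):
    """Mean and sample variance when all individuals fall in one unit."""
    counts = [total_individuals] + [0] * (n_units - 1)
    mean = total_individuals / n_units
    variance = statistics.variance(counts)  # divisor n - 1
    return mean, variance


# Hypothetical survey: 10 sampling units, 50 individuals in total.
n = 10
m, s2 = most_clumped(n, 50)
print(m, s2, n * m ** 2)
```

Taking logarithms of S² = n·m² gives log S² = log n + 2 log m, the upper line. The lower line, log S² = log m, corresponds to a variance equal to the mean, as in a Poisson (random) distribution.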

20.
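The influence of environmental heterogeneity on wildlife distribution is markedly non-uniform in space. Traditional analyses mostly use classical linear regression models to quantify the relationship between wildlife distribution and environmental variables, which makes it difficult to capture the spatially heterogeneous character of species-environment relationships accurately. Geographically weighted regression (GWR) is a spatial analysis method proposed in recent years that embeds spatial structure in the linear regression model in order to detect non-uniformity in spatial relationships. Taking the giant panda in the Qinling Mountains as an example, we applied the GWR model to analyze the potential relationship between the spatial distribution of giant pandas and characteristics of environmental heterogeneity, and compared it with classical global ordinary least squares (OLS) regression. The results show that the AIC, R², and adjusted R² of the GWR model were all significantly better than those of the OLS model, that the local regression coefficient estimates of the GWR model reveal the complex spatial relationships between giant panda distribution and environmental variables more deeply, and that the GWR model can provide more effective theoretical support for science-based species conservation. GWR therefore offers a new method for exploring the spatially heterogeneous character of species-environment relationships and has promising applications in research on species habitat selection and use.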
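
The contrast between a global OLS fit and the local fits of GWR can be sketched with a single predictor. This is a minimal illustration under stated assumptions, not the authors' analysis: the Gaussian distance kernel, the fixed bandwidth, the function names, and the toy data (two distant clusters with different slopes) are all hypothetical choices.

```python
import math


def gaussian_weights(target, coords, bandwidth):
    """Weight each observation by its distance to the target location."""
    return [
        math.exp(-0.5 * (math.dist(target, c) / bandwidth) ** 2)
        for c in coords
    ]


def local_fit(target, coords, x, y, bandwidth):
    """Weighted least squares line y = a + b*x centred on `target`."""
    w = gaussian_weights(target, coords, bandwidth)
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Toy data: a western cluster where y = x and an eastern cluster where y = 3x.
coords = [(0, 0), (0, 1), (1, 0), (100, 0), (100, 1), (101, 0)]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [1.0, 2.0, 3.0, 3.0, 6.0, 9.0]

west = local_fit((0, 0), coords, x, y, bandwidth=5.0)
east = local_fit((100, 0), coords, x, y, bandwidth=5.0)
# A very large bandwidth gives (almost) equal weights, i.e. the global fit.
global_like = local_fit((0, 0), coords, x, y, bandwidth=1e9)
print(west, east, global_like)
```

The local fits recover a different slope in each cluster, while the near-global fit averages them into a single slope that describes neither region well. Practical GWR also involves choosing the bandwidth, for example by AIC or cross-validation, which this sketch does not attempt.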
