Similar literature
 20 similar documents found
1.
K F Hirji 《Biometrics》1991,47(2):487-496
A recently developed algorithm for generating the distribution of sufficient statistics for conditional logistic models can be put to a twofold use. First, it provides an avenue for performing inference for matched case-control studies that does not rely on the assumption of a large sample size. Second, joint distributions generated by this algorithm can be used to make comparisons of various inferential procedures that are free from Monte Carlo sampling errors. In this paper, these two features of the algorithm are utilized to compare small-sample properties of the exact, mid-P value, and score tests for a conditional logistic model with two unmatched binary covariates. Both uniparametric and multiparametric tests, performed at a nominal significance level of .05, were studied. It was found that the actual significance levels of the mid-P test tend to be closer to the nominal level when compared with those of the other two tests.
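As a rough illustration of how an exact P value and a mid-P value differ, the sketch below uses a plain one-sided binomial test. This is not the conditional logistic algorithm described in the abstract; the function names and the example counts (8 successes in 10 trials) are assumptions chosen only for illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_and_mid_p(x, n, p0=0.5):
    """Upper-tail exact and mid-P values for H0: p = p0 (illustrative only)."""
    point = binom_pmf(x, n, p0)  # probability of the observed count
    upper = sum(binom_pmf(k, n, p0) for k in range(x + 1, n + 1))
    exact = upper + point        # exact test counts the observed point fully
    mid_p = upper + 0.5 * point  # mid-P counts only half of it
    return exact, mid_p


# Hypothetical data: 8 successes out of 10 trials.
exact, mid_p = exact_and_mid_p(x=8, n=10)
print(f"exact p = {exact:.4f}, mid-P = {mid_p:.4f}")
```

Because the mid-P value gives the observed outcome only half weight, it is never larger than the exact P value, which is one way to see why it tends to be less conservative in small samples.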

2.
In survival studies with families or geographical units, it may be of interest to test whether such groups are homogeneous for given explanatory variables. In this paper we consider score-type tests for group homogeneity based on a mixing model in which the group effect is modelled as a random variable. As opposed to hazard-based frailty models, this model yields survival times that, conditional on the random effect, have an accelerated failure time representation. The test statistic requires only estimation of the conventional regression model without the random effect and does not require specifying the distribution of the random effect. The tests are derived for a Weibull regression model, and in the uncensored situation a closed form is obtained for the test statistic. A simulation study is used to compare the power of the tests. The proposed tests are applied to real data sets with censored data.

3.
In this article, we describe a conditional score test for detecting a monotone dose‐response relationship with ordinal response data. We consider three different versions of this test: asymptotic, conditional exact, and mid‐P conditional score test. Exact and asymptotic power formulae based on these tests will be studied. Asymptotic sample size formulae based on the asymptotic conditional score test will be derived. The proposed formulae are applied to a vaccination study and a developmental toxicity study for illustrative purposes. Actual significance level and exact power properties of these tests are compared in a small empirical study. The mid‐P conditional score test is observed to be the most powerful test with actual significance level close to the pre‐specified nominal level.

4.
Summary. Many assessment instruments used in the evaluation of toxicity, safety, pain, or disease progression consider multiple ordinal endpoints to fully capture the presence and severity of treatment effects. Contingency tables underlying these correlated responses are often sparse and imbalanced, rendering asymptotic results unreliable or model fitting prohibitively complex without overly simplistic assumptions on the marginal and joint distribution. Instead of a modeling approach, we look at stochastic order and marginal inhomogeneity as an expression or manifestation of a treatment effect under much weaker assumptions. Often, endpoints are grouped together into physiological domains or by the body function they describe. We derive tests based on these subgroups, which might supplement or replace the individual endpoint analysis because they are more powerful. The permutation or bootstrap distribution is used throughout to obtain global, subgroup, and individual significance levels as they naturally incorporate the correlation among endpoints. We provide a theorem that establishes a connection between marginal homogeneity and the stronger exchangeability assumption under the permutation approach. Multiplicity adjustments for the individual endpoints are obtained via stepdown procedures, while subgroup significance levels are adjusted via the full closed testing procedure. The proposed methodology is illustrated using a collection of 25 correlated ordinal endpoints, grouped into six domains, to evaluate toxicity of a chemical compound.

5.
With interest in spatial ecology growing, correlational field studies are likely to become increasingly important. Unfortunately, ecological field data often do not follow the assumptions of classical statistics, so techniques like the popular and powerful multiple linear regression and its variants are often unreliable, and results can be misleading. The generalized linear model (GLM) is a flexible extension of linear regression that has proved especially useful for discrete data. In this paper, the technique is adapted to accommodate spatially correlated, discrete data. Specifically, to demonstrate the approach, Japanese beetle grub [Popillia japonica Newman (Coleoptera, Scarabaeidae)] population density in the field is modeled as a function of soil organic matter content. The response variable (grub counts in small soil samples) was a spatially autocorrelated, discrete random variable. Three classes of GLMs of the association between soil organic matter content and grub density were compared: (i) regression (assuming normally distributed response variable), (ii) GLM assuming negative binomial counts, and (iii) GLM based on the assumption that the counts conformed to Taylor's power law (TPL). Because the grubs were distributed in patches rather than at random, models that explicitly accounted for the spatial autocorrelation of grub counts were constructed, and compared with models that assumed independent observations. The fitted values for the discrete GLMs [viz., (ii) and (iii)] differed noticeably from the fitted values from multiple regression; but fitted values among the negative binomial and TPL GLMs were virtually identical, regardless of whether the spatial covariance was incorporated into a model, whether a spherical or exponential variogram model was used, or whether variance function parameters were estimated over a large or small scale. However, P‐values for the overall significance of the models depended heavily on whether the GLM assumed a discrete or continuous response variable, and whether or not spatial autocorrelation in the response variable was accounted for. On average, P‐values were 45‐fold higher in the spatial GLMs than in the non‐spatial and 23‐fold higher in the discrete GLMs than in the continuous.

6.
Overdispersion is a common phenomenon in Poisson modeling, and the negative binomial (NB) model is frequently used to account for overdispersion. Testing approaches (Wald test, likelihood ratio test (LRT), and score test) for overdispersion in the Poisson regression versus the NB model are available. Because the generalized Poisson (GP) model is similar to the NB model, we consider the former as an alternate model for overdispersed count data. The score test has an advantage over the LRT and the Wald test in that the score test only requires that the parameter of interest be estimated under the null hypothesis. This paper proposes a score test for overdispersion based on the GP model and compares the power of the test with the LRT and Wald tests. A simulation study indicates that the score test based on the asymptotic standard Normal distribution is more appropriate in practical applications because of its higher empirical power; however, it underestimates the nominal significance level, especially in small-sample situations. Examples illustrate the results of comparing the candidate tests between the Poisson and GP models. A bootstrap test is also proposed to adjust for the underestimation of the nominal level by the score statistic when the sample size is small. The simulation study indicates that the bootstrap test has a significance level closer to the nominal size and has uniformly greater power than the score test based on the asymptotic standard Normal distribution. From a practical perspective, we suggest that, if the score test gives even a weak indication that the Poisson model is inappropriate, say at the 0.10 significance level, the more accurate bootstrap procedure be used as a better test of whether the GP model is more appropriate than the Poisson model. Finally, the Vuong test is illustrated to choose between GP and NB2 models for the same dataset.

7.
8.
Interim analyses in clinical trials are planned for ethical as well as economic reasons. General results have been published in the literature that allow the use of standard group sequential methodology if one uses an efficient test statistic, e.g., when Wald-type statistics are used in random-effects models for ordinal longitudinal data. These models often assume that the random effects are normally distributed. However, this is not always the case. We will show that, when the random-effects distribution is misspecified in ordinal regression models, the joint distribution of the test statistics over the different interim analyses is still a multivariate normal distribution, but a sandwich-type correction to the covariance matrix is needed in order to obtain the correct covariance matrix. The independent increment structure is also investigated. A bias in estimation will occur due to the misspecification. However, we will also show that the treatment effect estimate will be unbiased under the null hypothesis, thus maintaining the type I error. Extensive simulations based on a toenail dermatophyte onychomycosis trial are used to illustrate our results.

9.
Although a number of regression models for ordinal responses have been proposed, these models are not widely known and applied in epidemiology and biomedical research. Overviews of these models are either highly technical or consider only a small part of this class of models so that it is difficult to understand the features of the models and to recognize important relations between them. In this paper we give an overview of logistic regression models for ordinal data based upon cumulative and conditional probabilities. We show how the most popular ordinal regression models, namely the proportional odds model and the continuation ratio model, are embedded in the framework of generalized linear models. We describe the characteristics and interpretations of these models and show how the calculations can be performed by means of SAS and S‐Plus. We illustrate and compare the methods by applying them to data of a study investigating the effect of several risk factors on diabetic retinopathy. A special aspect is the violation of the usual assumption of equal slopes which makes the correct application of standard models impossible. We show how to use extensions of the standard models to work adequately with this situation.

10.
Comparative studies on cnidocysts, involving adequate statistical treatment, are very scarce. Classical statistical tests are frequently used assuming normal frequency distributions of capsule lengths, but many distributions are non-normal in acontiarian sea anemones. Non-parametric tests are a traditional choice in these situations, although they are not as powerful as parametric tests. An extension of classical methods was developed by some authors; these models, called Generalized Linear Models (GLM), can be used under certain conditions with non-normal data. In view of the properties of our data, which are positive, skewed, and have a constant coefficient of variation, a GLM with gamma distribution and inverse link function was chosen to analyse the cnidae of acontia from the species Haliplanella lineata, Tricnidactis errans and Anthothoe chilensis. Graphical analysis of residuals showed that these assumptions were reasonable. This method allowed us to avoid transforming the data set and to avoid controversial cases at the limit of the significance level. For this task, appropriate subroutines in the GLIM language were written. In all cases highly significant differences were found between the specimens considered for every species and nematocyst type (b-rhabdoids, p-rhabdoids B1b and p-rhabdoids B2a).

11.
We consider a nonparametric (NP) approach to the analysis of repeated measures designs with censored data. Using the NP model of Akritas and Arnold (1994, Journal of the American Statistical Association 89, 336-343) for marginal distributions, we present test procedures for the NP hypotheses of no main effects, no interaction, and no simple effects. This extends the existing NP methodology for such designs (Wei and Lachin, 1984, Journal of the American Statistical Association 79, 653-661). The procedures do not require any modeling assumptions and should be useful in cases where the assumptions of proportional hazards or location shift fail to be satisfied. The large-sample distribution of the test statistics is based on an i.i.d. representation for Kaplan-Meier integrals. The testing procedures apply also to ordinal data and to data with ties. Useful small-sample approximations are presented, and their performance is examined in a simulation study. Finally, the methodology is illustrated with two real life examples, one with censored and one with missing data. It is indicated that one of the data sets does not conform to any set of assumptions underlying the available methods and also that the present method provides a useful additional analysis even when data sets conform to modeling assumptions.

12.
Multitemporal data sets from the Landsat Thematic Mapper (TM) were used to evaluate their applicability for exploratory soil mapping in the floodplain of the Northern Pantanal of Mato Grosso, Brazil. Fifty-four soil profiles were classified into 21 soil units according to the FAO–UNESCO system. Information layers of vegetation types and dynamics of flooding were elaborated by applying supervised hierarchical classification rules. Geomorphologic units were mapped by visual image interpretation. Multinomial logistic regression was applied to test relations between thematic layers and soil units as well as aggregated soil clusters, developing a statistical mapping model. Northern Pantanal floodplain soils show a high variability as a function of age and granulometry of underlying sediments, as well as soil moisture and flooding regimes. GIS layers of nine vegetation formations, three geomorphologic units and three multi-temporal moisture types were elaborated. Cross-tabulations and multinomial logistic regression models indicate significant relations between FAO–UNESCO soil units and GIS layers. As soil sampling density had been low, a final predictive model was developed for the mapping of six aggregated soil clusters, obtaining a high significance level (p<0.05) for prediction. The applied methodology was found to be appropriate to develop models on soil–landscape relationships and improve information on spatial distribution of soil groupings in the Northern Pantanal.

13.
OBJECTIVES: The association of a candidate gene with disease can be evaluated by a case-control study in which the genotype distribution is compared for diseased cases and unaffected controls. Usually, the data are analyzed with Armitage's test using the asymptotic null distribution of the test statistic. Since this test does not generally guarantee a type I error rate less than or equal to the significance level alpha, tests based on exact null distributions have been investigated. METHODS: An algorithm to generate the exact null distribution for both Armitage's test statistic and a recently proposed modification of the Baumgartner-Weiss-Schindler statistic is presented. I have compared the tests in a simulation study. RESULTS: The asymptotic Armitage test is slightly anticonservative whereas the exact tests control the type I error rate. The exact Armitage test is very conservative, but the exact test based on the modification of the Baumgartner-Weiss-Schindler statistic has a type I error rate close to alpha. The exact Armitage test is the least powerful test; the difference in power between the other two tests is often small and the comparison does not show a clear winner. CONCLUSION: Simulation results indicate that an exact test based on the modification of the Baumgartner-Weiss-Schindler statistic is preferable for the analysis of case-control studies of genetic markers.

14.
Question: We provide a method to calculate the power of ordinal regression models for detecting temporal trends in plant abundance measured as ordinal cover classes. Does power depend on the shape of the unobserved (latent) distribution of percentage cover? How do cover class schemes that differ in the number of categories affect power? Methods: We simulated cover class data by “cutting‐up” a continuous logit‐beta distributed variable using 7‐point and 15‐point cover classification schemes. We used Monte Carlo simulation to estimate power for detecting trends with two ordinal models, proportional odds logistic regression (POM) and logistic regression with cover classes re‐binned into two categories, a model we term an assessment point model (APM). We include a model fit to the logit‐transformed percentage cover data for comparison, which is a latent model. Results: The POM had equal or higher power compared to the APM and latent model, but power varied in complex ways as a function of the assumed latent beta distribution. We discovered that if the latent distribution is skewed, a cover class scheme with more categories might yield higher power to detect trend. Conclusions: Our power analysis method maintains the connection between the observed ordinal cover classes and the unmeasured (latent) percentage cover variable, allowing for a biologically meaningful trend to be defined on the percentage cover scale. Both the shape of the latent beta distribution and the alternative hypothesis should be considered carefully when determining sample size requirements for long‐term vegetation monitoring using cover class measurements.
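The idea of "cutting up" a latent cover variable into ordinal classes can be sketched as follows. This is only a minimal illustration: the cutpoints, the beta parameters, and the function name are hypothetical and are not the 7-point scheme or the distributions used in the study, and no trend or power calculation is included.

```python
import bisect
import random

# Hypothetical class boundaries on the proportion-cover scale (assumption):
# six cutpoints define seven ordinal cover classes, numbered 1 to 7.
CUTPOINTS = [0.01, 0.05, 0.25, 0.50, 0.75, 0.95]


def simulate_cover_classes(n, alpha, beta, seed=0):
    """Draw n latent cover values from Beta(alpha, beta) and bin them."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    cover = [rng.betavariate(alpha, beta) for _ in range(n)]
    # bisect_right returns how many cutpoints lie at or below the value,
    # so adding 1 gives a class label between 1 and 7.
    return [bisect.bisect_right(CUTPOINTS, c) + 1 for c in cover]


# Hypothetical right-skewed latent distribution.
classes = simulate_cover_classes(n=200, alpha=2.0, beta=5.0)
```

Repeating such a simulation with a shift in the latent distribution over time, and fitting an ordinal model to each replicate, is the general shape of a Monte Carlo power analysis of this kind.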

15.
Association mapping can be a powerful tool for detecting quantitative trait loci (QTLs) without requiring line-crossing experiments. We previously proposed a Bayesian approach for simultaneously mapping multiple QTLs by a regression method that directly incorporates estimates of the population structure. In the present study, we extended our method to analyze ordinal and censored traits, since both types of traits are common in the evaluation of germplasm collections. Ordinal-probit and tobit models were employed to analyze ordinal and censored traits, respectively. In both models, we postulated the existence of a latent continuous variable associated with the observable data, and we used a Markov-chain Monte Carlo algorithm to sample the latent variable and determine the model parameters. We evaluated the efficiency of our approach by using simulated- and real-trait analyses of a rice germplasm collection. Simulation analyses based on real marker data showed that our models could reduce both false-positive and false-negative rates in detecting QTLs to reasonable levels. Simulation analyses based on highly polymorphic marker data, which were generated by coalescent simulations, showed that our models could be applied to genotype data based on highly polymorphic marker systems, like simple sequence repeats. For the real traits, we analyzed heading date as a censored trait and amylose content and the shape of milled rice grains as ordinal traits. We found significant markers that may be linked to previously reported QTLs. Our approach will be useful for whole-genome association mapping of ordinal and censored traits in rice germplasm collections.

16.
There are copula-based statistical models in the literature for regression with dependent data such as clustered and longitudinal overdispersed counts, for which parameter estimation and inference are straightforward. For situations where the main interest is in the regression and other univariate parameters and not the dependence, we propose a "weighted scores method", which is based on weighting score functions of the univariate margins. The weight matrices are obtained by initially fitting a discretized multivariate normal distribution, which admits a wide range of dependence. The general methodology is applied to negative binomial regression models. Asymptotic and small-sample efficiency calculations show that our method is robust and nearly as efficient as maximum likelihood for fully specified copula models. An illustrative example is given to show the use of our weighted scores method to analyze utilization of health care based on family characteristics.

17.
We compare two models for the analysis of repeated ordinal categorical data: the classical parametric model for means of scores assigned to the categories of the response variable and a nonparametric model based on relative effects derived from the marginal distribution functions of the response. An example in the field of Dentistry is used to illustrate and to compare the models. We also consider a simulation study to evaluate the type‐I error rates and the power of tests under both models in a balanced design setup. The simulation results suggest that both approaches behave similarly for equally spaced scores but may perform differently otherwise. (© 2004 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)

18.
Xie M  Simpson DG 《Biometrics》1999,55(1):308-316
This paper develops regression models for ordinal data with nonzero control response probabilities. The models are especially useful in dose-response studies where the spontaneous or natural response rate is nonnegligible and the dosage is logarithmic. These models generalize Abbott's formula, which has been commonly used to model binary data with nonzero background observations. We describe a biologically plausible latent structure and develop an EM algorithm for fitting the models. The EM algorithm can be implemented using standard software for ordinal regression. A toxicology data set where the proposed model fits the data but a more conventional model fails is used to illustrate the methodology.
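For orientation, Abbott's formula in the binary case combines a background response rate with a dose-dependent response. The sketch below shows only that binary formula, not the ordinal generalization or the EM algorithm of the paper; the logistic form of the dose-response curve, the function names, and the parameter values are assumptions for illustration.

```python
import math


def abbott_response(log_dose, background, intercept, slope):
    """Binary Abbott's formula: P = c + (1 - c) * F(log dose).

    F is taken to be a logistic curve here (an assumption); c is the
    spontaneous (background) response probability.
    """
    treatment = 1.0 / (1.0 + math.exp(-(intercept + slope * log_dose)))
    return background + (1.0 - background) * treatment


def abbott_corrected(observed, background):
    """Invert the formula to recover the dose-attributable response rate."""
    return (observed - background) / (1.0 - background)


# Hypothetical values: 10% background response, log dose 0, unit slope.
p = abbott_response(log_dose=0.0, background=0.1, intercept=0.0, slope=1.0)
print(p, abbott_corrected(p, background=0.1))
```

With a background rate of zero the formula reduces to the ordinary dose-response curve, which is why it is read as a correction for nonzero control responses.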

19.
Y. X. Fu 《Genetics》1996,143(1):557-570
The purpose of this paper is to develop statistical tests of the neutral model of evolution against a class of alternative models with the common characteristic of having an excess of mutations that occurred a long time ago or a reduction of recent mutations compared to the neutral model. This class of population genetics models includes models for structured populations, models with decreasing effective population size and models of selection and mutation balance. Four statistical tests were proposed in this paper for DNA samples from a population. Two of these tests, one new and another a modification of an existing test, are based on Ewens' sampling formula, and the other two new tests make use of the frequencies of mutations of various classes. Using simulated samples and regression analyses, the critical values of these tests can be computed from regression equations. This approach for computing the critical values of a test was found to be appropriate and quite effective. We examined the powers of these four tests using simulated samples from structured populations, populations with linearly decreasing sizes and models of selection and mutation balance and found that they are more powerful than existing statistical tests of the neutral model of evolution.

20.
The aim of this work was to predict the worldwide distribution of two pest species, Ceratitis capitata (Wiedemann), the Mediterranean fruit fly, and Lymantria dispar (L.), the gypsy moth, based on climatic factors. The distribution patterns of insect pests have most often been investigated using classical statistical models or ecoclimatic assessment models such as CLIMEX. In this study, we used an artificial neural network, the multilayer perceptron, trained using the backpropagation algorithm, to model the distribution of each species. The data matrix used to model the distribution of each species was divided into three data sets to (1) develop and train the model, (2) validate the model and prevent over-fitting, and (3) test each model on novel data. The percentage of correct predictions of the global distribution of each species was high: for the Mediterranean fruit fly, the three data sets gave 95.8, 81.5, and 80.6% correct predictions, respectively, and for the gypsy moth, 96.8, 84.3, and 81.5%. Kappa statistics used to test the level of significance of the results were highly significant (in all cases P < 0.0001). A sensitivity analysis applied to each model based on the calculation of the derivatives of each of a large number of input variables showed that the variables that contributed most to explaining the distribution of C. capitata were annual average temperature and annual potential evapotranspiration. For L. dispar, the average minimum temperature and minimum daylength range were the main explanatory variables. The ANN models and methods developed in this study offer powerful additional predictive approaches in invasive species research.
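The kappa statistic mentioned above measures agreement between predicted and observed classes beyond what chance alone would give. A minimal sketch of Cohen's kappa for a presence/absence confusion matrix follows; the counts are invented for illustration and are not results from the study, and the function name is an assumption.

```python
def cohens_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: observed, columns: predicted)."""
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of cases on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / total
    # Chance agreement: product of matching row and column totals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(size)
    ) / total**2
    return (observed - expected) / (1 - expected)


# Hypothetical counts: [[true absent, false present], [false absent, true present]].
example = [[40, 10], [5, 45]]
kappa = cohens_kappa(example)
print(f"kappa = {kappa:.2f}")
```

A percentage of correct predictions can look high simply because one class dominates; kappa discounts that chance agreement, which is why it is often reported alongside raw accuracy.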
