Similar articles
 20 similar articles found (search time: 15 ms)
1.
Using a medical example, a three-step procedure for analysing multi-dimensional contingency tables is introduced. This procedure has some good properties. Step one aims to capture the relationship structure between the variables connected by the contingency table. Only so-called graphical models, a subclass of hierarchical models with respect to the parameters of the log-linear model, are admitted. The models can be generated by combining hypotheses of pairwise conditional independence. For this purpose a so-called Extended Combination Procedure is proposed, using the position of the Chain of (hierarchical) Hypotheses. A useful symbolic notation for ‘Dependence Models’, in addition to the notation in the form of ‘Independence Models’ and ‘Minimal Sets’, is proposed. Step two analyses the significant conditional pairs with regard to the question of which attribute-level combinations of the condition complexes leave the relations significant. Step three investigates the tables recognised as significant in step two more closely, to get ideas about the ‘sources’ of dependencies and the possibilities of collapsing parts of the table. The procedure is mostly used in exploratory data analysis, although the single steps can also be used to test hypotheses.

2.
Proceeding from Lancaster's definition of interactions between random variables, the authors set up a model for contingency tables of any dimension. Three-dimensional contingency tables are used as an example to discuss first and second order interaction effects, and the conventional independence hypotheses are expressed as hypotheses concerning interaction effects. The opinions of other authors regarding second order interaction effects are discussed.

3.
A heuristic three-step procedure for analysing multidimensional contingency tables is given to meet the requirements of a mixed analysis of both the hypothesis-driven and the data-driven type. The first step provides the structure of relationships among the attributes by fitting an appropriate unsaturated log-linear model to the data of the given contingency table. Restriction to elementary hierarchical models allows them to be obtained by combining pairs of conditional independence. The result of the first step may be regarded as a certain validation of real model ideas. In the second step the significant pairs of conditional dependence are analysed with regard to the levels of the condition complex. In general, only those significant pairs are to be considered where the condition complex does not include the response variable. The third step may test special sub-hypotheses in the significant two-dimensional tables found in step two, or may extend the general statements by partitioning the corresponding test statistics into additive components. Application examples demonstrate the general line of action.

4.
Configural frequency analysis (CFA) is a widely used method for the identification of types and syndromes in contingency tables. However, the type model of CFA shows some major deficiencies. In this paper, we propose an alternative modeling of types eliminating the shortcomings of CFA. Basically, a type is modeled as a combination of traits or symptoms that deviates from the pattern of association holding true for the complementary configurations of the contingency table. The new approach is formulated in terms of a log-linear model. It is shown that parameter estimation can be performed with methods known from the analysis of incomplete contingency tables. Test procedures for confirmatory analysis and methods for exploratory search for type configurations are developed. We illustrate the methodology with two practical examples.

5.
When contingency tables of data on sequences, social relationships, feeding, habitat use, or other behaviour exhibit significant associations between variables, ethologists may analyse the residuals in the table in order to test more precise hypotheses about the associations found. This paper critically evaluates currently used and potentially available statistical methods for performing such tests. Specific examples of use are given and recommendations made.

6.
Incomplete contingency tables, i.e. tables with structurally caused empty cells, are analysed by means of so-called quasi-log-linear models. In general the expected values can be calculated by iterative cyclic adaptation to the corresponding marginals of the empirical contingency tables (in the same way as in complete tables) under different hierarchical hypotheses concerning the parameters of the models. For important cases of 2-dimensional contingency tables it can be shown that the expected values and test statistics can be found in closed form. If all 2-dimensional sub-tables or partial tables of a 3-dimensional table can be assigned to such cases, then the hypotheses of classes (AB×C) (??), (B×C)/A(??), (A??B)/A(??) etc. are testable in closed form. But the expected values for (A×B×C) (×) have to be calculated iteratively. An example shows that some definite additive decompositions of the test statistic 2I are no longer valid, while some others remain valid in spite of the incompleteness of the tables.
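The iterative cyclic adaptation to marginals mentioned in this abstract can be sketched for the simplest case, a 2-dimensional table with structural zeros fitted under quasi-independence. This is a minimal illustration and not the authors' program: the function names, the convergence tolerance and the example counts are all assumptions made here.

```python
import math


def fit_quasi_independence(observed, structural_zero, tol=1e-10, max_iter=1000):
    """Fit quasi-independence by iterative proportional fitting.

    observed        -- list of rows of counts
    structural_zero -- same shape; True marks a structurally empty cell
    (Hypothetical helper written for illustration only.)
    """
    n_rows, n_cols = len(observed), len(observed[0])
    # Margins are taken over the admissible cells only.
    row_totals = [sum(observed[i][j] for j in range(n_cols)
                      if not structural_zero[i][j]) for i in range(n_rows)]
    col_totals = [sum(observed[i][j] for i in range(n_rows)
                      if not structural_zero[i][j]) for j in range(n_cols)]
    # Start at 1 in admissible cells; structural zeros stay 0 throughout.
    fitted = [[0.0 if structural_zero[i][j] else 1.0 for j in range(n_cols)]
              for i in range(n_rows)]
    for _ in range(max_iter):
        # Adapt to the row margins.
        for i in range(n_rows):
            current = sum(fitted[i])
            if current > 0:
                scale = row_totals[i] / current
                fitted[i] = [value * scale for value in fitted[i]]
        # Adapt to the column margins.
        for j in range(n_cols):
            current = sum(fitted[i][j] for i in range(n_rows))
            if current > 0:
                scale = col_totals[j] / current
                for i in range(n_rows):
                    fitted[i][j] *= scale
        # Columns now match exactly; stop once the rows match as well.
        worst = max(abs(sum(fitted[i]) - row_totals[i]) for i in range(n_rows))
        if worst < tol:
            break
    return fitted


def likelihood_ratio_statistic(observed, fitted):
    """2 * sum(o * ln(o / e)) over cells with positive observed and fitted values."""
    return 2.0 * sum(o * math.log(o / e)
                     for row_o, row_e in zip(observed, fitted)
                     for o, e in zip(row_o, row_e)
                     if o > 0 and e > 0)


# Made-up 3 x 3 table whose diagonal is structurally empty.
observed = [[0, 10, 20], [30, 0, 40], [50, 60, 0]]
zeros = [[i == j for j in range(3)] for i in range(3)]
fitted = fit_quasi_independence(observed, zeros)
statistic = likelihood_ratio_statistic(observed, fitted)
```

The fitted values reproduce the observed row and column margins while the diagonal stays at zero. The closed-form cases discussed in the abstract are not covered by this sketch.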

7.
Various one-sided tests for the comparison of differential treatment effects in two independent groups are developed for qualitative data. This is done for matched and for independent samples, in the case of identical and of different initial distributions. Tests for various kinds of treatment effects are considered. In each case the two observed contingency tables are collapsed into two independent fourfold tables. The paper describes how to select the appropriate procedure for practical problems, and most methods are illustrated in detail by means of a numerical example from clinical psychology.

8.
This paper shows that the sum of products models for three and higher order interactions in contingency tables can be reparameterized in the spirit of TUKEY (1949) to yield chi-square tests with one degree of freedom. The merits of this new test over the other known tests for the same hypotheses are discussed.

9.
Large contingency tables summarizing categorical variables arise in many areas. One example is in biology, where large numbers of biomarkers are cross‐tabulated according to their discrete expression level. Interactions of the variables are of great interest and are generally studied with log–linear models. The structure of a log–linear model can be visually represented by a graph from which the conditional independence structure can then be easily read off. However, since the number of parameters in a saturated model grows exponentially in the number of variables, this generally comes with a heavy computational burden. Even if we restrict ourselves to models of lower‐order interactions or other sparse structures, we are faced with the problem of a large number of cells which play the role of sample size. This is in sharp contrast to high‐dimensional regression or classification procedures because, in addition to a high‐dimensional parameter, we also have to deal with the analogue of a huge sample size. Furthermore, high‐dimensional tables naturally feature a large number of sampling zeros which often leads to the nonexistence of the maximum likelihood estimate. We therefore present a decomposition approach, where we first divide the problem into several lower‐dimensional problems and then combine these to form a global solution. Our methodology is computationally feasible for log–linear interaction models with many categorical variables each or some of them having many levels. We demonstrate the proposed method on simulated data and apply it to a bio‐medical problem in cancer research.

10.
Genetic association studies: design, analysis and interpretation   Total citations: 6, self-citations: 0, citations by others: 6
This paper provides a review of the design and analysis of genetic association studies. In case control studies, the different contingency tables and their relationships to the underlying genetic model are defined. Population stratification is discussed, with suggested methods to identify and correct for the effect. The transmission disequilibrium test is provided as an alternative family-based test, which is robust to population stratification. The relative benefits of each analysis are summarised.
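The transmission disequilibrium test named in this abstract is commonly computed as a McNemar-type statistic on the transmissions from heterozygous parents. The sketch below assumes that standard biallelic form; the function name and the counts are made up for illustration and do not come from the paper.

```python
import math


def transmission_disequilibrium_test(transmitted, untransmitted):
    """McNemar-type TDT for one biallelic marker.

    transmitted   -- heterozygous parents who passed the candidate allele on
    untransmitted -- heterozygous parents who passed the other allele on
    Returns the chi-square statistic (1 degree of freedom) and its p-value.
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi_square = (transmitted - untransmitted) ** 2 / total
    # Upper tail of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical counts: 60 transmissions versus 40 non-transmissions.
chi_square, p_value = transmission_disequilibrium_test(60, 40)
```

Only heterozygous parents enter the statistic. Because each parent serves as its own control, this is the usual explanation of the robustness to population stratification that the abstract mentions.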

11.
Standard statistical analyses of distributions of individuals from contingency tables are generally invalid if the individuals are not distributed independently of each other. In this paper, we discuss a method of testing hypotheses about classification category occupancy rates for overdispersed populations or for populations whose individuals are distributed in groups rather than singly. These methods are based on population redistribution simulations and provide valid, exact and powerful tests in situations for which classical methods are not appropriate. Illustrations are given using European corn borer egg data.

12.
Let categorical data coming from a control group and (r - 1) treated groups be given in an r × c contingency table. A simultaneous test procedure is derived for the (r - 1) hypotheses that the probabilities of all c categories do not differ between the i-th treated group and the control. For small tables and small cell frequencies the test is performed exactly by generating all tables having the given marginal sums. If only 2 categories or 2 groups are given, the asymptotic distribution of the test statistic is known; otherwise its distribution may be simulated if the computational expenditure of performing an exact test is too large. By means of a Monte Carlo study it is shown that this method maintains its nominal level more reliably and that it has better power than others.

13.
This paper examines various association, symmetry and “diagonal band” class models for both the British and Danish social mobility data. Composite models are also fitted to these data and the variety of models considered ensures that for most square tables, parsimonious models within the class of models examined in this study can always be found that will adequately describe such tables. The models considered in this study, which have been described in various forms by Goodman (1984), Upton (1985) and Tomizawa (1986), can suit most square tables having ordered classificatory variables. A model selection procedure is also examined.

14.
We address the problem of tests of homogeneity in two-way contingency tables in case-control studies when the case category is subdivided into k subcategories. In this situation, we have two cells with large frequencies and 2 × k cells with frequencies that become small as k increases. We propose two ad hoc statistics in which a statistic for the sparse cells is combined with a statistic for the cells with large frequencies. We study these tests along with the Pearson test (using a chi-square approximation) in a Monte Carlo simulation study. Two sets of null hypothesis models and two sets of alternative hypothesis models are considered. The best test for the models considered is the usual Pearson test (using an approximate chi-square distribution), although the ad hoc statistics are more powerful under one alternative model considered.

15.
Lájer (2007) notes that, to investigate phytosociological and ecological relationships, many authors apply traditional inferential tests to sets of relevés obtained by non-random methods. Unfortunately, this procedure does not provide reliable support for hypothesis testing because non-random sampling violates the assumptions of independence required by many parametric inferential tests. Instead, a random sampling scheme is recommended. Nonetheless, random sampling will not eliminate spatial autocorrelation. For instance, a classical law of geography holds that everything in a piece of (biotic) space is interrelated, but near objects are more related than distant ones. Because most ecological processes that shape community structure and species coexistence are spatially explicit, spatial autocorrelation is a vital part of almost all ecological data. This means that, independently of the underlying sampling design, ecological data are generally spatially autocorrelated, violating the assumption of independence that is generally required by traditional inferential tests. To overcome this drawback, randomization tests may be used. Such tests evaluate statistical significance based on empirical distributions generated from the sample and do not necessarily require data independence. However, as concerns hypothesis testing, randomization tests are not the universal remedy for ecologists, because the choice of inadequate null models can have significant effects on the ecological hypotheses tested. In this paper, I emphasize the need to develop null models for which the statistical assumptions match the underlying biological mechanisms.
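A randomization test of the kind referred to in this abstract can be sketched as a two-sample permutation test on the difference of means. Everything below is an illustrative assumption (function name, statistic, number of permutations, data). In particular, free shuffling of group labels is itself a null model that treats all observations as exchangeable, which is exactly the kind of choice the abstract warns may be inadequate for spatially autocorrelated data.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Monte Carlo permutation test for a difference in group means.

    Null model (an assumption of this sketch): group labels are exchangeable.
    """
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    def mean_difference(values):
        first, second = values[:n_a], values[n_a:]
        return abs(sum(first) / len(first) - sum(second) / len(second))

    observed = mean_difference(pooled)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign the labels at random
        if mean_difference(pooled) >= observed:
            extreme += 1
    # Count the observed arrangement itself so the p-value is never zero.
    return (extreme + 1) / (n_permutations + 1)


# Made-up measurements from two sets of plots.
p_separated = permutation_p_value([1, 2, 3, 4, 5], [11, 12, 13, 14, 15])
p_identical = permutation_p_value([1, 2, 3], [1, 2, 3])
```

For spatially structured data, a restricted scheme (for example, permuting only within blocks) would replace the `rng.shuffle` step; which restriction is appropriate depends on the biological mechanism assumed.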

16.
A general multistage (stepwise) procedure is proposed for dealing with arbitrary gatekeeping problems including parallel and serial gatekeeping. The procedure is very simple to implement since it does not require the application of the closed testing principle and the consequent need to test all nonempty intersections of hypotheses. It is based on the idea of carrying forward the Type I error rate for any rejected hypotheses to test hypotheses in the next ordered family. This requires the use of a so-called separable multiple test procedure (MTP) in the earlier family. The Bonferroni MTP is separable, but other standard MTPs such as Holm, Hochberg, Fallback and Dunnett are not. Their truncated versions are proposed which are separable and more powerful than the Bonferroni MTP. The proposed procedure is illustrated by a clinical trial example.
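The relationship between the Bonferroni, Holm and truncated Holm procedures within a single family can be sketched as follows. The sketch assumes the convex-combination form of the truncated Holm critical values, with a truncation fraction called `gamma` here; that form, the function name and the p-values are assumptions of this illustration and should be checked against the paper. The carrying forward of the error rate to the next family is not shown.

```python
def truncated_holm(p_values, alpha=0.05, gamma=0.5):
    """Step-down test with critical values between Bonferroni and Holm.

    Assumed critical value at step i (1-based) of n hypotheses:
        (gamma / (n - i + 1) + (1 - gamma) / n) * alpha
    gamma = 0 reproduces Bonferroni and gamma = 1 reproduces Holm.
    Returns a list of booleans (True = rejected) in the input order.
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda k: p_values[k])
    rejected = [False] * n
    for step, index in enumerate(order):  # step runs from 0 to n - 1
        critical = (gamma / (n - step) + (1.0 - gamma) / n) * alpha
        if p_values[index] <= critical:
            rejected[index] = True
        else:
            break  # step-down: stop at the first non-rejection
    return rejected


# Hypothetical p-values for three hypotheses in one family.
p_values = [0.03, 0.01, 0.022]
holm = truncated_holm(p_values, gamma=1.0)
half = truncated_holm(p_values, gamma=0.5)
bonferroni = truncated_holm(p_values, gamma=0.0)
```

With these made-up values the Holm version rejects all three hypotheses, while the half-truncated and Bonferroni versions reject only the smallest p-value.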

17.
In modern molecular biology one of the standard ways of analyzing a vertebrate immune system is to sequence and compare the counts of specific antigen receptor clones (either immunoglobulins or T-cell receptors) derived from various tissues under different experimental or clinical conditions. The resulting statistical challenges are difficult and do not fit readily into the standard statistical framework of contingency tables primarily due to the serious under-sampling of the receptor populations. This under-sampling is caused, on one hand, by the extreme diversity of antigen receptor repertoires maintained by the immune system and, on the other, by the high cost and labor intensity of the receptor data collection process. In most of the recent immunological literature the differences across antigen receptor populations are examined via non-parametric statistical measures of the species overlap and diversity borrowed from ecological studies. While this approach is robust in a wide range of situations, it seems to provide little insight into the underlying clonal size distribution and the overall mechanism differentiating the receptor populations. As a possible alternative, the current paper presents a parametric method that adjusts for the data under-sampling as well as provides a unifying approach to a simultaneous comparison of multiple receptor groups by means of the modern statistical tools of unsupervised learning. The parametric model is based on a flexible multivariate Poisson-lognormal distribution and is seen to be a natural generalization of the univariate Poisson-lognormal models used in the ecological studies of biodiversity patterns. The procedure for evaluating a model's fit is described along with the public domain software developed to perform the necessary diagnostics. The model-driven analysis is seen to compare favorably vis-à-vis traditional methods when applied to the data from T-cell receptors in transgenic mice populations.

18.
An algorithm for correspondence analysis is described and implemented in SAS/IML (SAS Institute, 1985a). The technique is shown, through the analysis of several biological examples, to supplement the log-linear models approach to the analysis of contingency tables, both in the model identification and model interpretation stages of analysis. A simple two-way contingency table of tumor data is analyzed using correspondence analysis. This example emphasises the relationships between the parameters of the log-linear model for the table and the graphical correspondence analysis results. The technique is also applied to a three-way table of survey data concerning ulcer patients to demonstrate applications of simple correspondence analysis to higher dimensional tables with fixed margins. Finally, the diets and foraging behaviors of birds of the Hubbard Brook Forest are each analyzed and then a simultaneous display of the two separate but related tables is constructed to highlight relationships between the tables. Received on August 29, 1988; accepted on April 25, 1989

19.
PARCAT is a computer program which implements alternative tests for average partial association in three-way contingency tables within the framework of the product multiple hypergeometric probability model. Primary attention is directed at the relationship between two of the variables, controlling for the effects of a covariable. This approach is essentially a multivariate extension of the Cochran/Mantel-Haenszel test to sets of (s × r) tables. A set of scores such as uniform, ridits, or probits can be assigned to categories which are ordinally scaled. In particular, if ridit scores with midranks assigned for ties are utilized, this procedure is equivalent to a partial Kruskal-Wallis test when one variable is ordinally scaled, and is equivalent to a partial Spearman rank correlation test when both variables are ordinally scaled.
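The Cochran/Mantel-Haenszel test that PARCAT is said to extend can be sketched in its simplest form, a set of 2 × 2 tables, one per level of the covariable. This is not PARCAT's algorithm: the general (s × r) case with category scores needs a matrix formulation that is not shown, and the function name and counts below are made up. No continuity correction is applied.

```python
import math


def cochran_mantel_haenszel(strata):
    """CMH test of average partial association for 2 x 2 x K tables.

    strata -- list of tables [[a, b], [c, d]], one per covariable level
    Returns the chi-square statistic (1 degree of freedom) and its p-value.
    """
    sum_observed = sum_expected = sum_variance = 0.0
    for (a, b), (c, d) in strata:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum this small carries no information
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        # Hypergeometric mean and variance of cell a given the margins.
        sum_observed += a
        sum_expected += row1 * col1 / n
        sum_variance += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    if sum_variance == 0:
        raise ValueError("all strata are degenerate")
    chi_square = (sum_observed - sum_expected) ** 2 / sum_variance
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Two hypothetical strata with the same direction of association.
tables = [[[10, 5], [5, 10]], [[10, 5], [5, 10]]]
chi_square, p_value = cochran_mantel_haenszel(tables)
```

Summing the observed counts, expectations and variances across strata before forming the ratio is what makes this a test of average partial association rather than a test within each table.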

20.
Assessing the agreement between two or more raters is an important topic in medical practice. Existing techniques, which deal with categorical data, are based on contingency tables. This is often an obstacle in practice as we have to wait for a long time to collect the appropriate sample size of subjects to construct the contingency table. In this paper, we introduce a nonparametric sequential test for assessing agreement, which can be applied as data accrue and does not require a contingency table, facilitating a rapid assessment of the agreement. The proposed test is based on the cumulative sum of the number of disagreements between the two raters and a suitable statistic representing the waiting time until the cumulative sum exceeds a predefined threshold. We treat the cases of testing two raters' agreement with respect to one or more characteristics and using two or more classification categories, the case where the two raters extremely disagree, and finally the case of testing more than two raters' agreement. The numerical investigation shows that the proposed test has excellent performance. Compared to the existing methods, the proposed method appears to require a significantly smaller sample size with equivalent power. Moreover, the proposed method is easily generalizable and brings the problem of assessing the agreement between two or more raters and one or more characteristics under a unified framework, thus providing an easy-to-use tool to medical practitioners.

