Similar literature
20 similar articles found
1.
An exact trend test for correlated binary data   (Cited by: 1; self-citations: 0; citations by others: 1)
The problem of testing a dose-response relationship in the presence of exchangeably correlated binary data has been addressed using a variety of models. The most commonly used approaches are derived from likelihood or generalized estimating equations and rely on large-sample theory to justify their inferences. However, while earlier work has determined that these methods may perform poorly for small or sparse samples, there are few alternatives available to those faced with such data. We propose an exact trend test for exchangeably correlated binary data when groups of correlated observations are ordered. This exact approach is based on an exponential model derived by Molenberghs and Ryan (1999) and Ryan and Molenberghs (1999) and provides natural analogues to Fisher's exact test and the binomial trend test when the data are correlated. We use a graphical method with which one can efficiently compute the exact tail distribution and apply the test to two examples.

2.
There is growing interest in conducting cluster randomized trials (CRTs). For simplicity in sample size calculation, the cluster sizes are assumed to be identical across all clusters. However, equal cluster sizes are not guaranteed in practice. Therefore, the relative efficiency (RE) of unequal versus equal cluster sizes has been investigated when testing the treatment effect. One of the most important approaches to analyzing a set of correlated data is the generalized estimating equation (GEE) proposed by Liang and Zeger, in which the "working correlation structure" is introduced and the association pattern depends on a vector of association parameters denoted by ρ. In this paper, we utilize GEE models to test the treatment effect in a two-group comparison for continuous, binary, or count data in CRTs. The variances of the estimator of the treatment effect are derived for the different types of outcome. RE is defined as the ratio of the variance of the treatment-effect estimator under equal cluster sizes to that under unequal cluster sizes. We discuss the exchangeable structure, which is commonly used in CRTs, and derive a simpler formula for RE with continuous, binary, and count outcomes. Finally, REs are investigated for several scenarios of cluster size distributions through simulation studies. We propose an adjusted sample size to account for the efficiency loss. We also propose an optimal sample size estimation based on the GEE models under a fixed budget, for known and unknown association parameter (ρ) in the working correlation structure within the cluster.
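
The RE definition above can be illustrated with a minimal sketch for a continuous outcome under an exchangeable working correlation. It assumes the textbook result that a cluster of size n contributes information proportional to n / (1 + (n - 1)ρ) to the treatment-effect estimate, and that both arms share the same cluster-size distribution. The formula, the function names, and the example cluster sizes are assumptions made for illustration and are not taken from the paper.

```python
def cluster_information(n, rho):
    """Information from one cluster of size n, up to a constant 1/sigma^2 factor (assumed form)."""
    return n / (1.0 + (n - 1) * rho)


def relative_efficiency(cluster_sizes, rho):
    """Var(equal sizes) / Var(unequal sizes), with the number of clusters and total sample size held fixed."""
    k = len(cluster_sizes)
    mean_size = sum(cluster_sizes) / k
    info_unequal = sum(cluster_information(n, rho) for n in cluster_sizes)
    info_equal = k * cluster_information(mean_size, rho)
    # Variance is inverse information, so Var_equal / Var_unequal = info_unequal / info_equal.
    return info_unequal / info_equal


sizes = [5, 10, 15, 30]  # hypothetical cluster sizes
re = relative_efficiency(sizes, rho=0.05)
print(f"RE = {re:.3f}")
```

Under these assumptions RE is at most 1, and it equals 1 when all clusters have the same size or when ρ = 0. Dividing the planned number of clusters by RE is one common way to compensate for the loss; the abstract does not say whether the paper's adjusted sample size takes this form.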

3.
This paper considers the use of a multivariate binomial probit model for the analysis of correlated exchangeable binary data. The model can naturally accommodate both cluster and individual level covariates, while keeping a fairly flexible intracluster association structure. We discuss Bayesian estimation when a sample of independent clusters of varying sizes is available, and show how Gibbs sampling may be used to derive the posterior densities of parameters. The methodology is illustrated with two examples: the first involves epidemiological data from a study of familial disease aggregation; the second uses teratological data from a developmental toxicity application.

4.
A covariance estimator for GEE with improved small-sample properties   (Cited by: 2; self-citations: 0; citations by others: 2)
Mancl LA, DeRouen TA. Biometrics, 2001, 57(1): 126-134
In this paper, we propose an alternative covariance estimator to the robust covariance estimator of generalized estimating equations (GEE). Hypothesis tests using the robust covariance estimator can have inflated size when the number of independent clusters is small. Resampling methods, such as the jackknife and bootstrap, have been suggested for covariance estimation when the number of clusters is small. A drawback of the resampling methods when the response is binary is that the methods can break down when the number of subjects is small due to zero or near-zero cell counts caused by resampling. We propose a bias-corrected covariance estimator that avoids this problem. In a small simulation study, we compare the bias-corrected covariance estimator to the robust and jackknife covariance estimators for binary responses for situations involving 10-40 subjects with equal and unequal cluster sizes of 16-64 observations. The bias-corrected covariance estimator gave tests with sizes close to the nominal level even when the number of subjects was 10 and cluster sizes were unequal, whereas the robust and jackknife covariance estimators gave tests with sizes that could be 2-3 times the nominal level. The methods are illustrated using data from a randomized clinical trial on treatment for bone loss in subjects with periodontal disease.

5.
In a recent paper, Marascuilo [19] provided an asymptotic solution to the important question of how to test for differences in change parameters when paired observations of binary type (+, -) have been made on two or more independent samples of individuals. In this article, an alternative approach is presented that yields asymptotic as well as exact tests for changes. They are based on pre-post test designs from clinical research and allow for controlled evaluation of one treatment modality as well as for comparing two or more treatment modalities. The rationale of the tests is based on McNemar's [21] test for paired binary observations in one, two, or k samples.

6.
Hoff PD. Biometrics, 2005, 61(4): 1027-1036
This article develops a model-based approach to clustering multivariate binary data, in which the attributes that distinguish a cluster from the rest of the population may depend on the cluster being considered. The clustering approach is based on a multivariate Dirichlet process mixture model, which allows for the estimation of the number of clusters, the cluster memberships, and the cluster-specific parameters in a unified way. Such a clustering approach has applications in the analysis of genomic abnormality data, in which the development of different types of tumors may depend on the presence of certain abnormalities at subsets of locations along the genome. Additionally, such a mixture model provides a nonparametric estimation scheme for dependent sequences of binary data.

7.

Background

The theory has been put forward that if a null hypothesis is true, P-values should follow a Uniform distribution. This can be used to check the validity of randomisation.

Method

The theory was tested by simulation for two sample t tests for data from a Normal distribution and a Lognormal distribution, for two sample t tests which are not independent, and for chi-squared and Fisher’s exact test using small and using large samples.

Results

For the two sample t test with Normal data the distribution of P-values was very close to the Uniform. When using Lognormal data this was no longer true, and the distribution had a pronounced mode. For correlated tests, even using data from a Normal distribution, the distribution of P-values varied from simulation run to simulation run, but did not look close to Uniform in any realisation. For binary data in a small sample, only a few probabilities were possible and the distribution was very uneven. With a sample of two groups of 1,000 observations, there was great unevenness in the histogram and a poor fit to the Uniform.

Conclusions

The notion that P-values for comparisons of groups using baseline data in randomised clinical trials should follow a Uniform distribution if the randomisation is valid has been found to be true only in the context of independent variables which follow a Normal distribution, not for Lognormal data, correlated variables, or binary data using either chi-squared or Fisher's exact tests. This should not be used as a check for valid randomisation.
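
The small-sample binary result can be made concrete without simulation. The following is a minimal sketch that enumerates the exact null distribution of the two-sided Fisher P-value for two groups of equal size. The group size (5), the common event probability (0.3), and the function names are hypothetical choices for illustration, and the code is not the authors' simulation.

```python
from math import comb


def fisher_two_sided_p(a, n1, n2, m):
    """Two-sided Fisher exact P-value for a 2x2 table.

    a = events in group 1, n1 and n2 = group sizes, m = total events.
    Sums the hypergeometric probabilities of all tables no more likely than the observed one.
    """
    total = comb(n1 + n2, m)
    lo, hi = max(0, m - n2), min(n1, m)
    probs = {x: comb(n1, x) * comb(n2, m - x) / total for x in range(lo, hi + 1)}
    p_obs = probs[a]
    # The small relative tolerance keeps tied probabilities together despite rounding.
    return sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))


def null_p_value_distribution(n, prob):
    """Exact null distribution of the P-value: two groups of size n, common event probability prob."""
    dist = {}
    for a in range(n + 1):
        for b in range(n + 1):
            weight = (comb(n, a) * prob**a * (1 - prob) ** (n - a)
                      * comb(n, b) * prob**b * (1 - prob) ** (n - b))
            p = round(fisher_two_sided_p(a, n, n, a + b), 10)
            dist[p] = dist.get(p, 0.0) + weight
    return dict(sorted(dist.items()))


dist = null_p_value_distribution(n=5, prob=0.3)
for p, weight in dist.items():
    print(f"P-value {p:.4f} occurs with probability {weight:.4f}")
```

With these settings only a handful of distinct P-values can occur, and a large share of the probability sits at P = 1, so the distribution cannot be close to Uniform even though the null hypothesis holds.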

8.
Pang Z, Kuk AY. Biometrics, 2007, 63(1): 218-227
Exchangeable binary data are often collected in developmental toxicity and other studies, and a whole host of parametric distributions for fitting this kind of data have been proposed in the literature. While these distributions can be matched to have the same marginal probability and intra-cluster correlation, they can be quite different in terms of shape and higher-order quantities of interest such as the litter-level risk of having at least one malformed fetus. A sensible alternative is to fit a saturated model (Bowman and George, 1995, Journal of the American Statistical Association 90, 871-879) using the expectation-maximization (EM) algorithm proposed by Stefanescu and Turnbull (2003, Biometrics 59, 18-24). The assumption of compatibility of marginal distributions is often made to link up the distributions for different cluster sizes so that estimation can be based on the combined data. Stefanescu and Turnbull proposed a modified trend test to test this assumption. Their test, however, fails to take into account the variability of an estimated null expectation and as a result leads to inaccurate p-values. This drawback is rectified in this article. When the data are sparse, the probability function estimated using a saturated model can be very jagged and some kind of smoothing is needed. We extend the penalized likelihood method (Simonoff, 1983, Annals of Statistics 11, 208-218) to the present case of unequal cluster sizes and implement the method using an EM-type algorithm. In the presence of a covariate, we propose a penalized kernel method that performs smoothing in both the covariate and response space. The proposed methods are illustrated using several data sets and the sampling and robustness properties of the resulting estimators are evaluated by simulations.

9.
The neurotoxicity of a substance is often tested using animal bioassays. In the functional observational battery, animals are exposed to a test agent and multiple outcomes are recorded to assess toxicity, using approximately 40 animals measured on up to 30 different items. This design gives rise to a challenging statistical problem: a large number of outcomes for a small sample of subjects. We propose an exact test for multiple binary outcomes, under the assumption that the correlation among these items is equal. This test is based upon an exponential model described by Molenberghs and Ryan (1999, Environmetrics 10, 279-300) and extends the methods developed by Corcoran et al. (2001, Biometrics 57, 941-948) who developed an exact test for exchangeably correlated binary data for groups (clusters) of correlated observations. We present a method that computes an exact p-value testing for a joint dose-response relationship. An estimate of the parameter for dose response is also determined along with its 95% confidence bound. The method is illustrated using data from a neurotoxicity bioassay for the chemical perchlorethylene.

10.
Heinze G, Gnant M, Schemper M. Biometrics, 2003, 59(4): 1151-1157
The asymptotic log-rank and generalized Wilcoxon tests are the standard procedures for comparing samples of possibly censored survival times. For comparison of samples of very different sizes, an exact test is available that is based on a complete permutation of log-rank or Wilcoxon scores. While the asymptotic tests do not keep their nominal sizes if sample sizes differ substantially, the exact complete permutation test requires equal follow-up of the samples. Therefore, we have developed and present two new exact tests also suitable for unequal follow-up. The first of these is an exact analogue of the asymptotic log-rank test and conditions on observed risk sets, whereas the second approach permutes survival times while conditioning on the realized follow-up in each group. In an empirical study, we compare the new procedures with the asymptotic log-rank test, the exact complete permutation test, and an earlier proposed approach that equalizes the follow-up distributions using artificial censoring. Results confirm highly satisfactory performance of the exact procedure conditioning on realized follow-up, particularly in case of unequal follow-up. The advantage of this test over other options of analysis is finally exemplified in the analysis of a breast cancer study.

11.
Recently, several papers have been published that deal with the construction of exact unconditional tests for non-inferiority and confidence intervals based on the approximate unconditional restricted maximum likelihood test for two binomial random variables. Soon after these papers were published, StatXact, the commercially available software for exact tests, incorporated the new methods. There are, however, gaps in the proofs that have not since been resolved adequately. Furthermore, it turned out that the methods for testing non-inferiority are not coherent, and a test for non-inferiority can easily come to a different conclusion than the confidence interval inclusion rule. In this paper, a proposal is made for how to resolve the open problems. Berger and Boos (1994) developed the confidence interval method for testing equality of two proportions. StatXact (Version 5) has extended this method to shifted hypotheses. It is shown that, at least for unbalanced designs (i.e. largely different sample sizes), the Berger and Boos method can lead to controversial results.

12.
Significance testing for correlated binary outcome data   (Cited by: 1; self-citations: 0; citations by others: 1)
B Rosner, R C Milton. Biometrics, 1988, 44(2): 505-512
Multiple logistic regression is a commonly used multivariate technique for analyzing data with a binary outcome. One assumption needed for this method of analysis is the independence of outcome for all sample points in a data set. In ophthalmologic data and other types of correlated binary data, this assumption is often grossly violated and the validity of the technique becomes an issue. A technique has been developed (Rosner, 1984) that utilizes a polychotomous logistic regression model to allow one to look at multiple exposure variables in the context of a correlated binary data structure. This model is an extension of the beta-binomial model, which has been widely used to model correlated binary data when no covariates are present. In this paper, a relationship is developed between the two techniques, whereby it is shown that use of ordinary logistic regression in the presence of correlated binary data can result in true significance levels that are considerably larger than nominal levels in frequently encountered situations. This relationship is explored in detail in the case of a single dichotomous exposure variable. In this case, the appropriate test statistic can be expressed as an adjusted chi-square statistic based on the 2 × 2 contingency table relating exposure to outcome. The test statistic is easily computed as a function of the ordinary chi-square statistic and the correlation between eyes (or more generally between cluster members) for outcome and exposure, respectively. This generalizes some previous results obtained by Koval and Donner (1987, in Festschrift for V. M. Joshi, I. B. MacNeill (ed.), Vol. V, 199-224.(ABSTRACT TRUNCATED AT 250 WORDS)

13.
The accumulation of DNA microarray data has now made it possible to use gene expression profiles to analyse expression data. A gene expression profile contains the expression data for a given gene over various samples, and can be contrasted with an expression signature, which contains the expression data for a single sample. Gene expression profiles are most revealing when samples are grouped appropriately, either by standard clinical or pathological categories or by categories discovered through cluster analysis techniques. Expression profiles can exist at various levels of abstraction, yielding information across various tissues or across diseases within a particular tissue. Hypothesis tests may be applied to expression profiles on a large scale to identify candidate genes of interest.

14.
Donohue MC, Overholser R, Xu R, Vaida F. Biometrika, 2011, 98(3): 685-700
We study model selection for clustered data, when the focus is on cluster-specific inference. Such data are often modelled using random effects, and conditional Akaike information was proposed in Vaida & Blanchard (2005) and used to derive an information criterion under linear mixed models. Here we extend the approach to generalized linear and proportional hazards mixed models. Outside the normal linear mixed models, exact calculations are not available and we resort to asymptotic approximations. In the presence of nuisance parameters, a profile conditional Akaike information is proposed. Bootstrap methods are considered for their potential advantage in finite samples. Simulations show that the performance of the bootstrap and the analytic criteria is comparable, with the bootstrap demonstrating some advantages for larger cluster sizes. The proposed criteria are applied to two cancer datasets to select models when cluster-specific inference is of interest.

15.
E J Stanek, S R Diehl. Biometrics, 1988, 44(4): 973-983
Experimental designs that include repeated measures of binary response variables over time and under different conditions are common in biology. In such settings, it is often desirable to characterize the response pattern over time. When response variables are continuous, this characterization can be made in terms of a growth model such as the Potthoff-Roy growth curve model. We illustrate how a similar growth curve modeling strategy can be implemented using weighted least squares (WLS) methods for binary response data. The growth models are constructed in terms of polynomial functions across marginal response. However, when growth models are fit to repeated binary response, the nonsignificant higher-order polynomial functions are dropped from the model, rather than used as covariates. Dropping the nonsignificant polynomials from the model will reduce the number of response functions, and help avoid small-sample problems that can occur when the number of correlated response functions is large and sample sizes are small. The reduced set of response functions is then modeled using WLS methods. We illustrate such models with an example of binary fly oviposition response (accept or reject) exhibited by two populations of flies at four ages to two types of fruit.

16.
Chan IS, Tang NS, Tang ML, Chan PS. Biometrics, 2003, 59(4): 1170-1177
Testing of noninferiority has become increasingly important in modern medicine as a means of comparing a new test procedure to a currently available test procedure. Asymptotic methods have recently been developed for analyzing noninferiority trials using rate ratios under the matched-pair design. In small samples, however, the performance of these asymptotic methods may not be reliable, and they are not recommended. In this article, we investigate alternative methods that are desirable for assessing noninferiority trials, using the rate ratio measure under small-sample matched-pair designs. In particular, we propose an exact and an approximate exact unconditional test, along with the corresponding confidence intervals based on the score statistic. The exact unconditional method guarantees the type I error rate will not exceed the nominal level. It is recommended when strict control of type I error (protection against any inflated risk of accepting inferior treatments) is required. However, the exact method tends to be overly conservative (thus, less powerful) and computationally demanding. Via empirical studies, we demonstrate that the approximate exact score method, which is computationally simple to implement, controls the type I error rate reasonably well and has high power for hypothesis testing. On balance, the approximate exact method offers a very good alternative for analyzing correlated binary data from matched-pair designs with small sample sizes. We illustrate these methods using two real examples taken from a crossover study of soft lenses and a Pneumocystis carinii pneumonia study. We contrast the methods with a hypothetical example.

17.
Deletion diagnostics are introduced for the regression analysis of clustered binary outcomes estimated with alternating logistic regressions, an implementation of generalized estimating equations (GEE) that estimates regression coefficients in a marginal mean model and in a model for the intracluster association given by the log odds ratio. The diagnostics are developed within an estimating equations framework that recasts the estimating functions for association parameters based upon conditional residuals into equivalent functions based upon marginal residuals. Extensions of earlier work on GEE diagnostics follow directly, including computational formulae for one-step deletion diagnostics that measure the influence of a cluster of observations on the estimated regression parameters and on the overall marginal mean or association model fit. The diagnostic formulae are evaluated with simulation studies and with an application concerning an assessment of factors associated with health maintenance visits in primary care medical practices. The application and the simulations demonstrate that the proposed cluster-deletion diagnostics for alternating logistic regressions are good approximations of their exact fully iterated counterparts.

18.
McNemar's test is used to assess the difference between two different procedures (treatments) using independent matched-pair data. For matched-pair data collected in clusters, the tests proposed by Durkalski et al. and Obuchowski are popular and commonly used in practice since these tests do not require distributional assumptions or assumptions on the structure of the within-cluster correlation of the data. Motivated by these tests, this note proposes a modified Obuchowski test and illustrates comparisons of the proposed test with the extant methods. An extensive Monte Carlo simulation study suggests that the proposed test performs well with respect to the nominal size, and has higher power; Obuchowski's test is the most conservative, and the performance of Durkalski's test varies between that of the modified Obuchowski test and that of the original Obuchowski test. These results form the basis for our recommendation that (i) for equal cluster size, the modified Obuchowski test is always preferred; (ii) for varying cluster size Durkalski's test can be used for a small number of clusters (e.g. K < 50), whereas for a large number of clusters (e.g. K ≥ 50) the modified Obuchowski test is preferred. Finally, to illustrate practical application of the competing tests, two real collections of clustered matched-pair data are analyzed.
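
For reference, the classical McNemar test for independent matched pairs, which the clustered tests above generalize, can be sketched as follows. This is a minimal illustration of the standard statistic and its exact binomial counterpart; it does not implement the Durkalski, Obuchowski, or modified Obuchowski tests, and the function name and example counts are hypothetical.

```python
from math import comb


def mcnemar(b, c):
    """McNemar's test from the two discordant-pair counts.

    b = pairs positive under procedure 1 only, c = pairs positive under procedure 2 only.
    Returns (chi-square statistic with 1 df, exact two-sided binomial p-value).
    """
    n = b + c
    if n == 0:
        return 0.0, 1.0  # no discordant pairs, so no evidence of a difference
    chi2 = (b - c) ** 2 / n
    # Under H0 each discordant pair falls in either cell with probability 1/2.
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2**n
    p_exact = min(1.0, 2 * tail)
    return chi2, p_exact


chi2, p_exact = mcnemar(b=12, c=3)  # hypothetical discordant counts
print(f"chi-square = {chi2:.2f}, exact p = {p_exact:.4f}")
```

Only the discordant pairs enter the statistic. The sketch treats every pair as independent, which is exactly the assumption that fails when pairs are collected in clusters.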

19.
B Rosner. Biometrics, 1992, 48(3): 721-731
Clustered binary data occur frequently in biostatistical work. Several approaches have been proposed for the analysis of clustered binary data. In Rosner (1984, Biometrics 40, 1025-1035), a polychotomous logistic regression model was proposed that is a generalization of the beta-binomial distribution and allows for unit- and subunit-specific covariates, while controlling for clustering effects. One assumption of this model is that all pairs of subunits within a cluster are equally correlated. This is appropriate for ophthalmologic work where clusters are generally of size 2, but may be inappropriate for larger cluster sizes. A beta-binomial mixture model is introduced to allow for multiple subclasses within a cluster and to estimate odds ratios relating outcomes for pairs of subunits within a subclass as well as in different subclasses. To include covariates, an extension of the polychotomous logistic regression model is proposed, which allows one to estimate effects of unit-, class-, and subunit-specific covariates, while controlling for clustering using the beta-binomial mixture model. This model is applied to the analysis of respiratory symptom data in children collected over a 14-year period in East Boston, Massachusetts, in relation to maternal and child smoking, where the unit is the child and symptom history is divided into early-adolescent and late-adolescent symptom experience.

20.
Summary: In some biomedical studies involving clustered binary responses (say, disease status), the cluster sizes can vary because some components of the cluster can be absent. When both the presence of a cluster component and the binary disease status of a present component are treated as responses of interest, we propose a novel two-stage random effects logistic regression framework. For ease of interpretation of regression effects, both the marginal probability of presence/absence of a component and the conditional probability of disease status of a present component preserve the approximate logistic regression forms. We present a maximum likelihood method of estimation implementable using standard statistical software. We compare our models and the physical interpretation of regression effects with competing methods from the literature. We also present a simulation study to assess the robustness of our procedure to misspecification of the random effects distribution and to compare finite-sample performances of estimates with existing methods. The methodology is illustrated by analyzing a study of the periodontal health status in a diabetic Gullah population.
