Similar articles
 Found 20 similar articles; search took 359 ms
1.
In two‐stage group sequential trials with a primary and a secondary endpoint, the overall type I error rate for the primary endpoint is often controlled by an α‐level boundary, such as an O'Brien‐Fleming or Pocock boundary. Following a hierarchical testing sequence, the secondary endpoint is tested only if the primary endpoint achieves statistical significance either at an interim analysis or at the final analysis. To control the type I error rate for the secondary endpoint, this is tested using a Bonferroni procedure or any α‐level group sequential method. In comparison with marginal testing, there is an overall power loss for the test of the secondary endpoint since a claim of a positive result depends on the significance of the primary endpoint in the hierarchical testing sequence. We propose two group sequential testing procedures with improved secondary power: the improved Bonferroni procedure and the improved Pocock procedure. The proposed procedures use the correlation between the interim and final statistics for the secondary endpoint while applying graphical approaches to transfer the significance level from the primary endpoint to the secondary endpoint. The procedures control the familywise error rate (FWER) strongly by construction and this is confirmed via simulation. We also compare the proposed procedures with other commonly used group sequential procedures in terms of control of the FWER and the power of rejecting the secondary hypothesis. An example is provided to illustrate the procedures.

2.
The classical group sequential test procedures proposed by Pocock (1977) and O'Brien and Fleming (1979) rest on the assumption of equal sample sizes between the interim analyses. It is well known that in most situations little additional Type I error arises if monitoring is performed with unequal sample sizes between the stages. In some cases, however, problems can arise that result in unacceptably liberal behavior of the test procedure. In this article, worst case scenarios of sample size imbalance between the inspection times are considered. Exact critical values for the Pocock and the O'Brien and Fleming group sequential designs are derived for arbitrary and for varying but bounded sample sizes. The approach represents a reasonable alternative to the flexible method that is based on the Type I error rate spending function. The SAS syntax for performing the calculations is provided. Using these procedures, the inspection times or the sample sizes in the consecutive stages need to be chosen independently of the data observed so far.

3.
In a typical clinical trial, there are one or two primary endpoints, and a few secondary endpoints. When at least one primary endpoint achieves statistical significance, there is considerable interest in using results for the secondary endpoints to enhance characterization of the treatment effect. Because multiple endpoints are involved, regulators may require that the familywise type I error rate be controlled at a pre-set level. This requirement can be achieved by using "gatekeeping" methods. However, existing methods suffer from logical oddities such as allowing results for secondary endpoint(s) to impact the likelihood of success for the primary endpoint(s). We propose a novel and easy-to-implement gatekeeping procedure that is devoid of such deficiencies. A real data example and simulation results are used to illustrate efficiency gains of our method relative to existing methods.

4.
Hung et al. (2007) considered the problem of controlling the type I error rate for a primary and secondary endpoint in a clinical trial using a gatekeeping approach in which the secondary endpoint is tested only if the primary endpoint crosses its monitoring boundary. They considered a two-look trial and showed by simulation that the naive method of testing the secondary endpoint at full level α at the time the primary endpoint reaches statistical significance does not control the familywise error rate at level α. Tamhane et al. (2010) derived analytic expressions for familywise error rate and power and confirmed the inflated error rate of the naive approach. Nonetheless, many people mistakenly believe that the closure principle can be used to prove that the naive procedure controls the familywise error rate. The purpose of this note is to explain in greater detail why there is a problem with the naive approach and show that the degree of alpha inflation can be as high as that of unadjusted monitoring of a single endpoint.
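The benchmark mentioned at the end of this abstract, unadjusted monitoring of a single endpoint, can be illustrated with a small simulation. The sketch below is not the authors' analysis; it assumes a two-look design with the interim at half the information, a two-sided level of 0.05 at each look, and standard normal test statistics under the null hypothesis. The function name and all parameter values are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist


def unadjusted_two_look_error(alpha=0.05, info_frac=0.5, n_sim=200_000, seed=1):
    """Monte Carlo estimate of the two-sided type I error when one endpoint
    is tested at the unadjusted level alpha at both the interim and final look."""
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sim):
        # Under H0 the interim and final z-statistics are standard normal
        # with correlation sqrt(info_frac) (independent increments).
        z1 = rng.gauss(0.0, 1.0)
        increment = rng.gauss(0.0, 1.0)
        z2 = math.sqrt(info_frac) * z1 + math.sqrt(1 - info_frac) * increment
        if abs(z1) > crit or abs(z2) > crit:
            rejections += 1
    return rejections / n_sim


estimate = unadjusted_two_look_error()
print(f"estimated type I error with two unadjusted looks: {estimate:.3f}")
```

Under these assumptions the estimate lands clearly above the nominal 0.05, which is the kind of inflation the note warns the naive secondary-endpoint test can approach.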

5.
The Newman-Keuls (NK) procedure for testing all pairwise comparisons among a set of treatment means, introduced by Newman (1939) and in a slightly different form by Keuls (1952), was proposed as a reasonable way to alleviate the inflation of error rates when a large number of means are compared. It was proposed before the concepts of different types of multiple error rates were introduced by Tukey (1952a, b; 1953). Although it was popular in the 1950s and 1960s, once control of the familywise error rate (FWER) was accepted generally as an appropriate criterion in multiple testing, and it was realized that the NK procedure does not control the FWER at the nominal level at which it is performed, the procedure gradually fell out of favor. Recently, a more liberal criterion, control of the false discovery rate (FDR), has been proposed as more appropriate in some situations than FWER control. This paper notes that the NK procedure and a nonparametric extension control the FWER within any set of homogeneous treatments. It proves that the extended procedure controls the FDR when there are well-separated clusters of homogeneous means and between-cluster test statistics are independent, and extensive simulation provides strong evidence that the original procedure controls the FDR under the same conditions and some dependent conditions when the clusters are not well-separated. Thus, the test has two desirable error-controlling properties, providing a compromise between FDR control with no subgroup FWER control and global FWER control. Yekutieli (2002) developed an FDR-controlling procedure for testing all pairwise differences among means, without any FWER-controlling criteria when there is more than one cluster.
The empirical example in Yekutieli's paper was used to compare the Benjamini-Hochberg (1995) method with apparent FDR control in this context, Yekutieli's proposed method with proven FDR control, the Newman-Keuls method that controls FWER within equal clusters with apparent FDR control, and several methods that control FWER globally. The Newman-Keuls is shown to be intermediate in number of rejections to the FWER-controlling methods and the FDR-controlling methods in this example, although it is not always more conservative than the other FDR-controlling methods.
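The contrast between global FWER control and FDR control that this abstract draws can be seen on a toy set of p-values. The sketch below implements two generic procedures, Holm's step-down method (FWER) and the Benjamini-Hochberg step-up method (FDR); it does not implement the Newman-Keuls procedure or Yekutieli's method, and the p-values are made up for illustration.

```python
def holm_rejections(p_values, alpha=0.05):
    """Indices rejected by Holm's step-down procedure (controls the FWER)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = []
    for rank, idx in enumerate(order):
        # Compare the (rank+1)-th smallest p-value with alpha / (m - rank);
        # stop at the first failure.
        if p_values[idx] <= alpha / (m - rank):
            rejected.append(idx)
        else:
            break
    return sorted(rejected)


def bh_rejections(p_values, alpha=0.05):
    """Indices rejected by the Benjamini-Hochberg step-up procedure (FDR)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is at most rank * alpha / m.
        if p_values[idx] <= rank * alpha / m:
            cutoff = rank
    return sorted(order[:cutoff])


# Hypothetical p-values, not taken from any of the cited papers.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
print("Holm rejects:", holm_rejections(p_values))
print("BH rejects:  ", bh_rejections(p_values))
```

On this toy input the FDR-controlling procedure rejects more hypotheses than the FWER-controlling one, which is the sense in which FDR control is described above as the more liberal criterion.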

6.
The evaluation of surrogate endpoints for primary use in future clinical trials is an increasingly important research area, due to demands for more efficient trials coupled with recent regulatory acceptance of some surrogates as 'valid.' However, little consideration has been given to how a trial that utilizes a newly validated surrogate endpoint as its primary endpoint might be appropriately designed. We propose a novel Bayesian adaptive trial design that allows the new surrogate endpoint to play a dominant role in assessing the effect of an intervention, while remaining realistically cautious about its use. By incorporating multitrial historical information on the validated relationship between the surrogate and clinical endpoints, then subsequently evaluating accumulating data against this relationship as the new trial progresses, we adaptively guard against an erroneous assessment of treatment based upon a truly invalid surrogate. When the joint outcomes in the new trial seem plausible given similar historical trials, we proceed with the surrogate endpoint as the primary endpoint, and do so adaptively, perhaps stopping the trial for early success or inferiority of the experimental treatment, or for futility. Otherwise, we discard the surrogate and switch adaptive determinations to the original primary endpoint. We use simulation to test the operating characteristics of this new design compared to a standard O'Brien-Fleming approach, as well as the ability of our design to discriminate trustworthy from untrustworthy surrogates in hypothetical future trials. Furthermore, we investigate possible benefits using patient-level data from 18 adjuvant therapy trials in colon cancer, where disease-free survival is considered a newly validated surrogate endpoint for overall survival.

7.
In recent years, the use of adaptive design methods in clinical research and development based on accrued data has become very popular due to its flexibility and efficiency. Based on the adaptations applied, adaptive designs can be classified into three categories: prospective, concurrent (ad hoc), and retrospective adaptive designs. An adaptive design allows modifications to be made to the trial and/or statistical procedures of ongoing clinical trials. However, it is a concern that the actual patient population after the adaptations could deviate from the original target patient population and consequently the overall type I error rate (the rate of erroneously claiming efficacy for an ineffective drug) may not be controlled. In addition, major adaptations of trial and/or statistical procedures of ongoing trials may result in a totally different trial that is unable to address the scientific/medical questions the trial intends to answer. In this article, several commonly considered adaptive designs in clinical trials are reviewed. Impacts of ad hoc adaptations (protocol amendments), challenges in by-design (prospective) adaptations, and obstacles of retrospective adaptations are described. Strategies for the use of adaptive design in clinical development of rare diseases are discussed. Some examples concerning the development of Velcade intended for multiple myeloma and non-Hodgkin's lymphoma are given. Practical issues that are commonly encountered when implementing adaptive design methods in clinical trials are also discussed.

8.
Using multiple historical trials with surrogate and true endpoints, we consider various models to predict the effect of treatment on a true endpoint in a target trial in which only a surrogate endpoint is observed. This predicted result is computed using (1) a prediction model (mixture, linear, or principal stratification) estimated from historical trials and the surrogate endpoint of the target trial and (2) a random extrapolation error estimated from successively leaving out each trial among the historical trials. The method applies to either binary outcomes or survival to a particular time that is computed from censored survival data. We compute a 95% confidence interval for the predicted result and validate its coverage using simulation. To summarize the additional uncertainty from using a predicted instead of true result for the estimated treatment effect, we compute its multiplier of standard error. Software is available for download.

9.
This article discusses specific assumptions necessary for permutation multiple tests to control the Familywise Error Rate (FWER). At issue is that, in comparing parameters of the marginal distributions of two sets of multivariate observations, validity of permutation testing is affected by all the parameters in the joint distributions of the observations. We show the surprising fact that, in the case of a linear model with i.i.d. errors such as in the analysis of Quantitative Trait Loci (QTL), this issue has no impact on control of FWER, if the test statistic is of a particular form. On the other hand, in the analysis of gene expression levels or multiple safety endpoints, unless some assumption connecting the marginal distributions of the observations to their joint distributions is made, permutation multiple tests may not control FWER.

10.
In the field of pharmaceutical drug development, there have been extensive discussions on the establishment of statistically significant results that demonstrate the efficacy of a new treatment with multiple co‐primary endpoints. When designing a clinical trial with such multiple co‐primary endpoints, it is critical to determine the appropriate sample size for indicating the statistical significance of all the co‐primary endpoints while preserving the desired overall power, because the type II error rate increases with the number of co‐primary endpoints. We consider overall power functions and sample size determinations with multiple co‐primary endpoints that consist of mixed continuous and binary variables, and provide numerical examples to illustrate the behavior of the overall power functions and sample sizes. In formulating the problem, we assume that response variables follow a multivariate normal distribution, where binary variables are observed in a dichotomized normal distribution with a certain point of dichotomy. Numerical examples show that the sample size decreases as the correlation increases when the individual powers of each endpoint are approximately and mutually equal.
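The dependence of overall power on correlation described here can be sketched for the simplest case of two co-primary endpoints. The code below is an illustration under strong simplifying assumptions, not the paper's method: both test statistics are treated as bivariate normal with unit variances (no dichotomization), each is tested one-sided at level 0.025, and the function name, effect size, and sample size are hypothetical. The joint probability is evaluated by one-dimensional trapezoidal integration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def overall_power(mu1, mu2, rho, alpha=0.025, steps=4000):
    """P(Z1 > c and Z2 > c) for bivariate normal test statistics with
    means mu1, mu2, unit variances and correlation rho (|rho| < 1)."""
    c = STD_NORMAL.inv_cdf(1 - alpha)
    lo = c - mu1                 # Z1 > c  <=>  X = Z1 - mu1 > lo
    hi = max(lo, 0.0) + 10.0     # the normal tail beyond this is negligible
    h = (hi - lo) / steps
    scale = math.sqrt(1 - rho * rho)

    def integrand(x):
        # Density of X times P(Z2 > c | X = x).
        cond = 1 - STD_NORMAL.cdf((c - mu2 - rho * x) / scale)
        return STD_NORMAL.pdf(x) * cond

    total = 0.5 * (integrand(lo) + integrand(hi))
    for k in range(1, steps):
        total += integrand(lo + k * h)
    return total * h


# Hypothetical design: standardized effect 0.3 on both endpoints,
# 200 patients per arm, so each z-statistic has mean 0.3 * sqrt(200 / 2).
mu = 0.3 * math.sqrt(200 / 2)
for rho in (0.0, 0.3, 0.6, 0.9):
    print(f"rho = {rho:.1f}: overall power = {overall_power(mu, mu, rho):.3f}")
```

With equal marginal powers, the joint power rises with the correlation, so a fixed target for overall power can be met with fewer patients, in line with the numerical finding reported in the abstract.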

11.
A surrogate endpoint is an endpoint that is obtained sooner, at lower cost, or less invasively than the true endpoint for a health outcome and is used to make conclusions about the effect of intervention on the true endpoint. In this approach, each previous trial with surrogate and true endpoints contributes an estimated predicted effect of intervention on true endpoint in the trial of interest based on the surrogate endpoint in the trial of interest. These predicted quantities are combined in a simple random-effects meta-analysis to estimate the predicted effect of intervention on true endpoint in the trial of interest. Validation involves comparing the average prediction error of the aforementioned approach with (i) the average prediction error of a standard meta-analysis using only true endpoints in the other trials and (ii) the average clinically meaningful difference in true endpoints implicit in the trials. Validation is illustrated using data from multiple randomized trials of patients with advanced colorectal cancer in which the surrogate endpoint was tumor response and the true endpoint was median survival time.

12.
Designs incorporating more than one endpoint have become popular in drug development. One such design allows for incorporation of short‐term information in an interim analysis if the long‐term primary endpoint has not yet been observed for some of the patients. We first consider a two‐stage design with binary endpoints allowing for futility stopping only, based on conditional power under both fixed and observed effects. Design characteristics of three estimators are compared: using the primary long‐term endpoint only, the short‐term endpoint only, and combining data from both. For each approach, equivalent cut‐off point values for fixed and observed effect conditional power calculations can be derived, resulting in the same overall power. While in trials stopping for futility the type I error rate cannot get inflated (it usually decreases), there is loss of power. In this study, we consider different scenarios, including different thresholds for conditional power, different amounts of information available at the interim, different correlations and probabilities of success. We further extend the methods to adaptive designs with unblinded sample size reassessments based on conditional power with the inverse normal method as the combination function. Two different futility stopping rules are considered: one based on the conditional power, and one from P‐values based on Z‐statistics of the estimators. Average sample size, probability to stop for futility and overall power of the trial are compared and the influence of the choice of weights is investigated.

13.
Valid inference in random effects meta-analysis   Total citations: 2, self-citations: 0, citations by others: 2
The standard approach to inference for random effects meta-analysis relies on approximating the null distribution of a test statistic by a standard normal distribution. This approximation is asymptotic in k, the number of studies, and can be substantially in error in medical meta-analyses, which often have only a few studies. This paper proposes permutation and ad hoc methods for testing with the random effects model. Under the group permutation method, we randomly switch the treatment and control group labels in each trial. This idea is similar to using a permutation distribution for a community intervention trial where communities are randomized in pairs. The permutation method theoretically controls the type I error rate for typical meta-analysis scenarios. We also suggest two ad hoc procedures. Our first suggestion is to use a t-reference distribution with k-1 degrees of freedom rather than a standard normal distribution for the usual random effects test statistic. We also investigate the use of a simple t-statistic on the reported treatment effects.
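The group permutation idea described above can be sketched as follows. Switching the treatment and control labels within a trial reverses the sign of that trial's estimated effect, so for k trials there are 2^k equally likely sign patterns under the null hypothesis. The sketch assumes the test statistic is the usual DerSimonian-Laird random effects z-statistic and enumerates all patterns exhaustively, which is feasible only for small k; the paper's exact statistic and implementation may differ, and the function names and data are hypothetical.

```python
import math
from itertools import product


def dl_z_statistic(effects, variances):
    """DerSimonian-Laird random effects z-statistic for the pooled effect."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed_mean = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi * wi for wi in w) / sw
    # Method-of-moments between-trial variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return pooled * math.sqrt(sum(w_star))


def group_permutation_p_value(effects, variances):
    """Two-sided p-value from all 2^k within-trial label switches."""
    observed = abs(dl_z_statistic(effects, variances))
    count = total = 0
    for signs in product((1, -1), repeat=len(effects)):
        flipped = [s * y for s, y in zip(signs, effects)]
        total += 1
        # Small tolerance so the observed pattern always counts itself.
        if abs(dl_z_statistic(flipped, variances)) >= observed - 1e-12:
            count += 1
    return count / total


# Hypothetical log odds ratios and their variances from six trials.
effects = [0.42, 0.31, 0.58, 0.12, 0.47, 0.36]
variances = [0.05, 0.08, 0.11, 0.04, 0.09, 0.06]
print("permutation p-value:", group_permutation_p_value(effects, variances))
```

One consequence visible in the code is that the smallest attainable two-sided p-value is 2 / 2^k, so with very few trials the permutation test cannot reach conventional significance levels.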

14.
Most existing phase II clinical trial designs focus on conventional chemotherapy with binary tumor response as the endpoint. The advent of novel therapies, such as molecularly targeted agents and immunotherapy, has made the endpoint of phase II trials more complicated, often involving ordinal, nested, and coprimary endpoints. We propose a simple and flexible Bayesian optimal phase II predictive probability (OPP) design that handles binary and complex endpoints in a unified way. The Dirichlet-multinomial model is employed to accommodate different types of endpoints. At each interim, given the observed interim data, we calculate the Bayesian predictive probability of success, should the trial continue to the maximum planned sample size, and use it to make the go/no-go decision. The OPP design controls the type I error rate, maximizes power or minimizes the expected sample size, and is easy to implement, because the go/no-go decision boundaries can be enumerated and included in the protocol before the onset of the trial. Simulation studies show that the OPP design has satisfactory operating characteristics.
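The interim calculation described in this abstract can be illustrated for the simplest case of a binary endpoint, where the Dirichlet-multinomial model reduces to a beta-binomial one. The sketch below is a generic predictive probability calculation, not the OPP design itself: it assumes a uniform Beta(1, 1) prior, defines final success as the posterior probability that the response rate exceeds a null rate p0 being above a threshold theta, and uses hypothetical names and numbers throughout.

```python
import math


def posterior_prob_exceeds(successes, n, p0):
    """P(p > p0 | data) under a uniform Beta(1, 1) prior.

    The posterior is Beta(successes + 1, n - successes + 1); for integer
    parameters its upper tail equals P(Binomial(n + 1, p0) <= successes).
    """
    return sum(
        math.comb(n + 1, j) * p0**j * (1 - p0) ** (n + 1 - j)
        for j in range(successes + 1)
    )


def predictive_probability(successes, n, n_max, p0, theta=0.95):
    """Probability, given interim data, that the trial ends in success
    if it continues to n_max patients."""
    a, b = successes + 1, n - successes + 1   # interim posterior Beta(a, b)
    m = n_max - n                             # patients still to be enrolled
    pp = 0.0
    for y in range(m + 1):
        # Log beta-binomial probability of y future responses out of m.
        log_w = (
            math.lgamma(m + 1) - math.lgamma(y + 1) - math.lgamma(m - y + 1)
            + math.lgamma(a + y) + math.lgamma(b + m - y) - math.lgamma(a + b + m)
            - math.lgamma(a) - math.lgamma(b) + math.lgamma(a + b)
        )
        if posterior_prob_exceeds(successes + y, n_max, p0) > theta:
            pp += math.exp(log_w)
    return pp


# Hypothetical interim: 20 of a planned 40 patients, null response rate 0.3.
for x in (4, 8, 12):
    print(x, "responses:", round(predictive_probability(x, 20, 40, 0.3), 3))
```

Because the calculation depends only on the interim counts, the values can be tabulated for every possible interim outcome before the trial starts, which is what makes it possible to list go/no-go boundaries in a protocol.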

15.
The U.S. Environmental Protection Agency determined that one of the major impediments to the advancement and application of ecological risk assessment is doubt concerning appropriate assessment endpoints. The Agency's Risk Assessment Forum determined that the best solution to this problem was to define a set of generic ecological assessment endpoints (GEAEs). They are assessment endpoints that are applicable to a wide range of ecological risk assessments; because they reflect the programmatic goals of the Agency, they are applicable to a wide array of environmental issues, and they may be estimated using existing assessment tools. They are not specifically defined for individual cases; some ad hoc elaboration by users is expected. The GEAEs are not exhaustive or mandatory. Although most of the Agency's ecological decisions have been based on organism-level effects, GEAEs are also defined for populations, ecosystems, and special places.

16.
In a clinical trial with an active treatment and a placebo the situation may occur that two (or even more) primary endpoints may be necessary to describe the active treatment's benefit. The focus of our interest is a more specific situation with two primary endpoints in which superiority in one of them would suffice given that non-inferiority is observed in the other. Several proposals exist in the literature for dealing with this or similar problems, but prove insufficient or inadequate at a closer look (e.g. Bloch et al. (2001, 2006) or Tamhane and Logan (2002, 2004)). For example, we were unable to find a good reason why a bootstrap p-value for superiority should depend on the initially selected non-inferiority margins or on the initially selected type I error alpha. We propose a hierarchical three step procedure, where non-inferiority in both variables must be proven in the first step, superiority has to be shown by a bivariate test (e.g. Holm (1979), O'Brien (1984), Hochberg (1988), a bootstrap (Wang (1998)), or Läuter (1996)) in the second step, and then superiority in at least one variable has to be verified in the third step by a corresponding univariate test. All statistical tests are performed at the same one-sided significance level alpha. From the above mentioned bivariate superiority tests we preferred Läuter's SS test and the Holm procedure for the reason that these have been proven to control the type I error strictly, irrespective of the correlation structure among the primary variables and the sample size applied. A simulation study reveals that the performance regarding power of the bivariate test depends to a considerable degree on the correlation and on the magnitude of the expected effects of the two primary endpoints. Therefore, the recommendation of which test to choose depends on knowledge of the possible correlation between the two primary endpoints.
In general, Läuter's SS procedure in step 2 shows the best overall properties, whereas Holm's procedure shows an advantage if both a positive correlation between the two variables and a considerable difference between their standardized effect sizes can be expected.

17.
A popular design for clinical trials assessing targeted therapies is the two-stage adaptive enrichment design with recruitment in stage 2 limited to a biomarker-defined subgroup chosen based on data from stage 1. The data-dependent selection leads to statistical challenges if data from both stages are used to draw inference on treatment effects in the selected subgroup. If subgroups considered are nested, as when defined by a continuous biomarker, treatment effect estimates in different subgroups follow the same distribution as estimates in a group-sequential trial. This result is used to obtain tests controlling the familywise type I error rate (FWER) for six simple subgroup selection rules, one of which also controls the FWER for any selection rule. Two approaches are proposed: one based on multivariate normal distributions suitable if the number of possible subgroups, k, is small, and one based on Brownian motion approximations suitable for large k. The methods, applicable in the wide range of settings with asymptotically normal test statistics, are illustrated using survival data from a breast cancer trial.

18.
K K Lan, J M Lachin. Biometrics, 1990, 46(3): 759-770
To control the Type I error probability in a group sequential procedure using the logrank test, it is important to know the information times (fractions) at the times of interim analyses conducted for purposes of data monitoring. For the logrank test, the information time at an interim analysis is the fraction of the total number of events to be accrued in the entire trial. In a maximum information trial design, the trial is concluded when a prespecified total number of events has been accrued. For such a design, therefore, the information time at each interim analysis is known. However, many trials are designed to accrue data over a fixed duration of follow-up on a specified number of patients. This is termed a maximum duration trial design. Under such a design, the total number of events to be accrued is unknown at the time of an interim analysis. For a maximum duration trial design, therefore, these information times need to be estimated. A common practice is to assume that a fixed fraction of information will be accrued between any two consecutive interim analyses, and then employ a Pocock or O'Brien-Fleming boundary. In this article, we describe an estimate of the information time based on the fraction of total patient exposure, which tends to be slightly negatively biased (i.e., conservative) if survival is exponentially distributed. We then present a numerical exploration of the robustness of this estimate when nonexponential survival applies. We also show that the Lan-DeMets (1983, Biometrika 70, 659-663) procedure for constructing group sequential boundaries with the desired level of Type I error control can be computed using the estimated information fraction, even though it may be biased. Finally, we discuss the implications of employing a biased estimate of study information for a group sequential procedure.
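How an estimated information fraction feeds into a spending-function calculation can be sketched as follows. The code assumes the two commonly quoted Lan-DeMets spending functions, the O'Brien-Fleming type and the Pocock type, and estimates information time as accrued patient exposure divided by planned total exposure, as the abstract describes. It only reports the cumulative type I error available at each look; deriving the actual boundary values requires numerical integration that is not shown. The exposure figures and function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def obf_type_spending(t, alpha=0.05):
    """O'Brien-Fleming-type spending: 2 * (1 - Phi(z_{1-alpha/2} / sqrt(t)))."""
    if t <= 0:
        return 0.0
    z = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return 2 * (1 - STD_NORMAL.cdf(z / math.sqrt(min(t, 1.0))))


def pocock_type_spending(t, alpha=0.05):
    """Pocock-type spending: alpha * ln(1 + (e - 1) * t)."""
    t = min(max(t, 0.0), 1.0)
    return alpha * math.log(1 + (math.e - 1) * t)


def estimated_information_fraction(exposure_so_far, planned_total_exposure):
    """Information time estimated from patient exposure, capped at 1."""
    return min(1.0, exposure_so_far / planned_total_exposure)


# Hypothetical patient-years of exposure at three interim looks and the end.
planned_total = 1200.0
for exposure in (310.0, 640.0, 950.0, 1200.0):
    t = estimated_information_fraction(exposure, planned_total)
    print(
        f"t = {t:.3f}: OBF-type alpha spent = {obf_type_spending(t):.4f}, "
        f"Pocock-type alpha spent = {pocock_type_spending(t):.4f}"
    )
```

Both functions spend the full alpha at t = 1, so a biased estimate of the information fraction changes how the error is distributed across looks rather than the total amount available.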

19.
Mid-study design modifications are becoming increasingly accepted in confirmatory clinical trials, so long as appropriate methods are applied such that error rates are controlled. It is therefore unfortunate that the important case of time-to-event endpoints is not easily handled by the standard theory. We analyze current methods that allow design modifications to be based on the full interim data, i.e., not only the observed event times but also secondary endpoint and safety data from patients who are yet to have an event. We show that the final test statistic may ignore a substantial subset of the observed event times. An alternative test incorporating all event times is found, where a conservative assumption must be made in order to guarantee type I error control. We examine the power of this approach using the example of a clinical trial comparing two cancer therapies.

20.
There is growing interest in integrated Phase I/II oncology clinical trials involving molecularly targeted agents (MTA). The main challenges of these trials are nontrivial dose–efficacy relationships and administration of MTAs in combination with other agents. While some designs were recently proposed for such Phase I/II trials, the majority of them consider the case of binary toxicity and efficacy endpoints only. At the same time, a continuous efficacy endpoint can carry more information about the agent's mechanism of action, but corresponding designs have received very limited attention in the literature. In this work, an extension of a recently developed information‐theoretic design for the case of a continuous efficacy endpoint is proposed. The design transforms the continuous outcome using the logistic transformation and uses an information–theoretic argument to govern selection during the trial. The performance of the design is investigated in settings of single‐agent and dual‐agent trials. It is found that the novel design leads to substantial improvements in operating characteristics compared to a model‐based alternative under scenarios with nonmonotonic dose/combination–efficacy relationships. The robustness of the design to missing/delayed efficacy responses and to the correlation in toxicity and efficacy endpoints is also investigated.
