Similar Articles (20 results)
1.
In pharmaceutical drug development, there has been extensive discussion of how to establish statistically significant results that demonstrate the efficacy of a new treatment on multiple co‐primary endpoints. When designing a clinical trial with multiple co‐primary endpoints, it is critical to determine a sample size that demonstrates statistical significance for all of the co‐primary endpoints while preserving the desired overall power, because the type II error rate increases with the number of co‐primary endpoints. We consider overall power functions and sample size determination for multiple co‐primary endpoints consisting of mixed continuous and binary variables, and provide numerical examples to illustrate the behavior of the overall power functions and sample sizes. In formulating the problem, we assume that the response variables follow a multivariate normal distribution, with the binary variables obtained by dichotomizing latent normal variables at fixed cut points. The numerical examples show that the sample size decreases as the correlation increases when the individual powers of the endpoints are approximately equal.
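The driving idea, that overall power is the joint probability that every co‐primary endpoint reaches significance, can be sketched for the simpler case of two continuous endpoints compared with one‐sided z‐tests. The sketch below assumes equal group sizes, standardized effect sizes `deltas`, and a known correlation `rho` between the two test statistics; it illustrates the general principle, not the paper's mixed continuous–binary method.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

def overall_power(n, deltas, rho, alpha=0.025):
    """P(both one-sided z-tests significant) for two continuous co-primary
    endpoints with standardized effects `deltas`, correlation `rho`,
    and n subjects per group (two equal arms)."""
    z_alpha = norm.ppf(1 - alpha)
    ncp = np.sqrt(n / 2) * np.asarray(deltas)           # mean of each z-statistic
    cov = [[1.0, rho], [rho, 1.0]]
    # P(Z1 > z_alpha, Z2 > z_alpha) = CDF of the centered statistics at ncp - z_alpha
    return multivariate_normal(mean=[0.0, 0.0], cov=cov).cdf(ncp - z_alpha)

def sample_size(deltas, rho, power=0.80, alpha=0.025):
    """Smallest per-group n achieving the target overall power."""
    n = 2
    while overall_power(n, deltas, rho, alpha) < power:
        n += 1
    return n

if __name__ == "__main__":
    for rho in (0.0, 0.3, 0.8):
        # with equal effects, the required n shrinks as the correlation grows
        print(rho, sample_size([0.3, 0.3], rho))
```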

2.
We consider a clinical trial with a primary and a secondary endpoint where the secondary endpoint is tested only if the primary endpoint is significant. The trial uses a group sequential procedure with two stages. The familywise error rate (FWER) of falsely concluding significance on either endpoint is to be controlled at a nominal level α. The type I error rate for the primary endpoint is controlled by choosing any α‐level stopping boundary, e.g., the standard O'Brien–Fleming or the Pocock boundary. Given any particular α‐level boundary for the primary endpoint, we study the problem of determining the boundary for the secondary endpoint to control the FWER. We study this FWER analytically and numerically and find that it is maximized when the correlation coefficient ρ between the two endpoints equals 1. For the four combinations of O'Brien–Fleming and Pocock boundaries for the primary and secondary endpoints, the critical constants required to control the FWER are computed for different values of ρ. An ad hoc boundary is proposed for the secondary endpoint to address a practical concern that may be at issue in some applications. Numerical studies indicate that the combination of an O'Brien–Fleming boundary for the primary endpoint and a Pocock boundary for the secondary endpoint generally gives the best primary as well as secondary power performance. The Pocock boundary may be replaced by the ad hoc boundary for the secondary endpoint with very little loss of secondary power if the practical concern is at issue. A clinical trial example is given to illustrate the methods.

3.
In two‐stage group sequential trials with a primary and a secondary endpoint, the overall type I error rate for the primary endpoint is often controlled by an α‐level boundary, such as an O'Brien‐Fleming or Pocock boundary. Following a hierarchical testing sequence, the secondary endpoint is tested only if the primary endpoint achieves statistical significance either at an interim analysis or at the final analysis. To control the type I error rate for the secondary endpoint, it is tested using a Bonferroni procedure or any α‐level group sequential method. In comparison with marginal testing, there is an overall power loss for the test of the secondary endpoint, since a claim of a positive result depends on the significance of the primary endpoint in the hierarchical testing sequence. We propose two group sequential testing procedures with improved secondary power: the improved Bonferroni procedure and the improved Pocock procedure. The proposed procedures use the correlation between the interim and final statistics for the secondary endpoint while applying graphical approaches to transfer the significance level from the primary endpoint to the secondary endpoint. The procedures control the familywise error rate (FWER) in the strong sense by construction, and this is confirmed via simulation. We also compare the proposed procedures with other commonly used group sequential procedures in terms of control of the FWER and the power of rejecting the secondary hypothesis. An example is provided to illustrate the procedures.

4.
The two‐sided Simes test is known to control the type I error rate with bivariate normal test statistics. For one‐sided hypotheses, control of the type I error rate requires that the correlation between the bivariate normal test statistics is non‐negative. In this article, we introduce a trimmed version of the one‐sided weighted Simes test for two hypotheses which rejects if (i) the one‐sided weighted Simes test rejects and (ii) both p‐values are below one minus the respective weighted Bonferroni adjusted level. We show that the trimmed version controls the type I error rate at nominal significance level α if (i) the common distribution of test statistics is point symmetric and (ii) the two‐sided weighted Simes test at level 2α controls the level. These assumptions apply, for instance, to bivariate normal test statistics with arbitrary correlation. In a simulation study, we compare the power of the trimmed weighted Simes test with the power of the weighted Bonferroni test and the untrimmed weighted Simes test. An additional result of this article ensures type I error rate control of the usual weighted Simes test under a weak version of the positive regression dependence condition for the case of two hypotheses. This condition is shown to apply to the two‐sided p‐values of one‐ or two‐sample t‐tests for bivariate normal endpoints with arbitrary correlation and to the corresponding one‐sided p‐values if the correlation is non‐negative. The Simes test for such types of bivariate t‐tests has not been considered before. According to our main result, the trimmed version of the weighted Simes test then also applies to the one‐sided bivariate t‐test with arbitrary correlation.
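For two one‐sided p‐values, the trimmed test described above reduces to two simple checks. The sketch below assumes weights that sum to 1 and a one‐sided level α; the function names and example numbers are illustrative, not from the paper.

```python
def weighted_simes_reject(p1, p2, w1, w2, alpha=0.025):
    """One-sided weighted Simes test of the intersection hypothesis:
    order by p-value (weights travel with their hypotheses) and reject if
    p_(1) <= alpha * w_(1) or p_(2) <= alpha * (w_(1) + w_(2))."""
    (q1, v1), (q2, v2) = sorted([(p1, w1), (p2, w2)])
    return q1 <= alpha * v1 or q2 <= alpha * (v1 + v2)

def trimmed_weighted_simes_reject(p1, p2, w1, w2, alpha=0.025):
    """Trimmed version: additionally require both p-values to fall below
    one minus the respective weighted Bonferroni adjusted level."""
    trimmed = (p1 < 1 - w1 * alpha) and (p2 < 1 - w2 * alpha)
    return trimmed and weighted_simes_reject(p1, p2, w1, w2, alpha)

# Equal weights, p-values 0.020 and 0.024 at one-sided alpha = 0.025:
# the Simes condition holds via the larger p-value, and neither p-value is
# close to 1, so trimming does not interfere -> True
print(trimmed_weighted_simes_reject(0.020, 0.024, 0.5, 0.5))
```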

5.
Most existing phase II clinical trial designs focus on conventional chemotherapy with binary tumor response as the endpoint. The advent of novel therapies, such as molecularly targeted agents and immunotherapy, has made the endpoint of phase II trials more complicated, often involving ordinal, nested, and coprimary endpoints. We propose a simple and flexible Bayesian optimal phase II predictive probability (OPP) design that handles binary and complex endpoints in a unified way. The Dirichlet-multinomial model is employed to accommodate different types of endpoints. At each interim, given the observed interim data, we calculate the Bayesian predictive probability of success, should the trial continue to the maximum planned sample size, and use it to make the go/no-go decision. The OPP design controls the type I error rate, maximizes power or minimizes the expected sample size, and is easy to implement, because the go/no-go decision boundaries can be enumerated and included in the protocol before the onset of the trial. Simulation studies show that the OPP design has satisfactory operating characteristics.
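For the simplest case of a single binary endpoint, the Bayesian predictive probability used for an interim go/no-go decision can be sketched with a beta-binomial model. This is a generic illustration; the prior, threshold, and success criterion below are assumptions, not the calibrated values of the OPP design.

```python
from scipy.stats import beta, betabinom

def predictive_prob(x, n, n_max, p0=0.20, theta=0.95, a=1.0, b=1.0):
    """Predictive probability that, after n_max patients, the posterior
    probability P(response rate > p0) exceeds theta, given x responses
    among the first n patients and a Beta(a, b) prior."""
    m = n_max - n                                   # patients still to enroll
    pp = 0.0
    for y in range(m + 1):                          # y = future responses
        # posterior after all n_max patients if y further responses occur
        success = beta.sf(p0, a + x + y, b + n_max - x - y) > theta
        if success:
            # predictive (beta-binomial) probability of observing y responses
            pp += betabinom.pmf(y, m, a + x, b + n - x)
    return pp

# Interim with 4 responses in 20 patients, maximum sample size 40; the
# go/no-go rule would compare this value with pre-specified cut-offs.
print(round(predictive_prob(x=4, n=20, n_max=40), 3))
```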

6.
In a clinical trial with an active treatment and a placebo, the situation may occur that two (or even more) primary endpoints are necessary to describe the active treatment's benefit. The focus of our interest is a more specific situation with two primary endpoints, in which superiority in one of them would suffice given that non-inferiority is observed in the other. Several proposals exist in the literature for dealing with this or similar problems, but prove insufficient or inadequate on closer inspection (e.g. Bloch et al. (2001, 2006) or Tamhane and Logan (2002, 2004)). For example, we were unable to find a good reason why a bootstrap p-value for superiority should depend on the initially selected non-inferiority margins or on the initially selected type I error alpha. We propose a hierarchical three-step procedure, where non-inferiority in both variables must be proven in the first step, superiority has to be shown by a bivariate test (e.g. Holm (1979), O'Brien (1984), Hochberg (1988), a bootstrap (Wang (1998)), or Läuter (1996)) in the second step, and superiority in at least one variable has to be verified in the third step by a corresponding univariate test. All statistical tests are performed at the same one-sided significance level alpha. From the above-mentioned bivariate superiority tests we prefer Läuter's SS test and the Holm procedure, because these have been proven to control the type I error strictly, irrespective of the correlation structure among the primary variables and the sample size. A simulation study reveals that the power of the bivariate test depends to a considerable degree on the correlation and on the magnitude of the expected effects of the two primary endpoints. Therefore, the recommendation of which test to choose depends on knowledge of the possible correlation between the two primary endpoints. In general, Läuter's SS procedure in step 2 shows the best overall properties, whereas Holm's procedure shows an advantage if both a positive correlation between the two variables and a considerable difference between their standardized effect sizes can be expected.
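The three-step hierarchy can be sketched with summary z-statistics. The version below uses an O'Brien-type combined z-statistic (with an assumed known correlation) for the bivariate step rather than Läuter's SS test, which requires individual-level data, so it illustrates the hierarchy rather than the authors' preferred implementation; all inputs are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def three_step_decision(z_ni, z_sup, rho, alpha=0.025):
    """Hierarchical three-step decision from one-sided z-statistics.
    z_ni : non-inferiority z-statistics for the two endpoints
    z_sup: superiority z-statistics for the two endpoints
    rho  : assumed correlation between the two test statistics
    All steps use the same one-sided level alpha."""
    crit = norm.ppf(1 - alpha)
    # Step 1: non-inferiority must be shown for BOTH endpoints
    if not all(z > crit for z in z_ni):
        return "no claim"
    # Step 2: bivariate superiority via an O'Brien-type combined statistic
    z_comb = (z_sup[0] + z_sup[1]) / np.sqrt(2 + 2 * rho)
    if z_comb <= crit:
        return "non-inferiority only"
    # Step 3: confirm superiority in at least one endpoint individually
    if any(z > crit for z in z_sup):
        return "superiority in at least one endpoint"
    return "non-inferiority only"

print(three_step_decision(z_ni=(2.4, 2.1), z_sup=(2.3, 0.9), rho=0.3))
```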

7.
Hung et al. (2007) considered the problem of controlling the type I error rate for a primary and secondary endpoint in a clinical trial using a gatekeeping approach in which the secondary endpoint is tested only if the primary endpoint crosses its monitoring boundary. They considered a two-look trial and showed by simulation that the naive method of testing the secondary endpoint at full level α at the time the primary endpoint reaches statistical significance does not control the familywise error rate at level α. Tamhane et al. (2010) derived analytic expressions for familywise error rate and power and confirmed the inflated error rate of the naive approach. Nonetheless, many people mistakenly believe that the closure principle can be used to prove that the naive procedure controls the familywise error rate. The purpose of this note is to explain in greater detail why there is a problem with the naive approach and show that the degree of alpha inflation can be as high as that of unadjusted monitoring of a single endpoint.
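A small Monte Carlo sketch makes the inflation visible. The setup below (two looks at information fractions 1/2 and 1, classical O'Brien-Fleming boundaries for the primary endpoint, secondary tested at full one-sided α = 0.025 at whichever look the primary rejects, primary drift fixed near an unfavourable value) is an illustrative assumption, not the exact configuration studied in the cited papers.

```python
import numpy as np

def naive_secondary_error(rho, delta=1.2, n_sim=400_000, seed=11):
    """Type I error for the secondary endpoint under the 'naive' rule:
    test it at full one-sided alpha = 0.025 at the look where the primary
    crosses its O'Brien-Fleming boundary. `delta` is the drift of the
    primary z-statistic at the final look; the secondary is under its null."""
    rng = np.random.default_rng(seed)
    r = np.sqrt(0.5)                              # correlation between looks
    c1, c2 = 1.977 * np.sqrt(2), 1.977            # two-look O'Brien-Fleming bounds
    z_full = 1.960                                # unadjusted one-sided 0.025 cut
    cov = np.array([[1,       r,       rho,     rho * r],
                    [r,       1,       rho * r, rho    ],
                    [rho,     rho * r, 1,       r      ],
                    [rho * r, rho,     r,       1      ]])
    mean = [delta * r, delta, 0.0, 0.0]           # (X1, X2, Y1, Y2)
    x1, x2, y1, y2 = rng.multivariate_normal(mean, cov, size=n_sim).T
    stop1 = x1 > c1                               # primary rejects at look 1
    stop2 = ~stop1 & (x2 > c2)                    # primary rejects at look 2
    return ((stop1 & (y1 > z_full)) | (stop2 & (y2 > z_full))).mean()

for rho in (0.0, 0.5, 0.8, 0.99):
    print(rho, round(naive_secondary_error(rho), 4))   # grows with rho, exceeds 0.025
```

As ρ approaches 1, the estimated error approaches that of unadjusted two-look monitoring of a single endpoint, which is the worst case described above.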

8.
We consider the problem of comparing two treatments on multiple endpoints where the goal is to identify the endpoints that have treatment effects, while controlling the familywise error rate. Two current approaches for this are (i) applying a global test within a closed testing procedure, and (ii) adjusting individual endpoint p‐values for multiplicity. We propose combining the two current methods. We compare the combined method with several competing methods in a simulation study. It is concluded that the combined approach maintains higher power under a variety of treatment effect configurations than the other methods and is thus more power‐robust.

9.
Multiple endpoints are tested to assess an overall treatment effect and also to identify which endpoints or subsets of endpoints contributed to treatment differences. The conventional p‐value adjustment methods, such as single‐step, step‐up, or step‐down procedures, sequentially identify each significant individual endpoint. Closed test procedures can also detect individual endpoints that have effects via a step‐by‐step closed strategy. This paper proposes a global‐based statistic for testing an a priori specified number, say r, of the k endpoints, as opposed to the conventional approach of testing a single endpoint (r = 1). The proposed test statistic is an extension of the single‐step p‐value‐based statistic based on the distribution of the smallest p‐value. The test maintains strong control of the familywise error (FWE) rate under the null hypothesis of no difference in any (sub)set of r endpoints among all possible combinations of the k endpoints. After rejecting the null hypothesis, the individual endpoints in the sets that are rejected can be tested further, using a univariate test statistic in a second step, if desired. However, the second step test only weakly controls the FWE. The proposed method is illustrated by application to a psychosis data set.

10.
In many phase III clinical trials, it is desirable to separately assess the treatment effect on two or more primary endpoints. Consider the MERIT-HF study, where two endpoints of primary interest were time to death and the earliest of time to first hospitalization or death (The International Steering Committee on Behalf of the MERIT-HF Study Group, 1997, American Journal of Cardiology 80[9B], 54J-58J). It is possible that treatment has no effect on death but a beneficial effect on first hospitalization time, or it has a detrimental effect on death but no effect on hospitalization. A good clinical trial design should permit early stopping as soon as the treatment effect on both endpoints becomes clear. Previous work in this area has not resolved how to stop the study early when one or more endpoints have no treatment effect or how to assess and control the many possible error rates for concluding wrong hypotheses. In this article, we develop a general methodology for group sequential clinical trials with multiple primary endpoints. This method uses a global alpha-spending function to control the overall type I error and a multiple decision rule to control error rates for concluding wrong alternative hypotheses. The method is demonstrated with two simulated examples based on the MERIT-HF study.
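The building block behind any alpha-spending approach is the standard single-endpoint error-spending calculation. The sketch below derives two-look boundaries from a Lan-DeMets O'Brien-Fleming-type spending function; it is a generic illustration of that machinery, not the multi-endpoint global spending rule proposed in the paper.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import multivariate_normal, norm

def obf_spending(t, alpha=0.025):
    """Lan-DeMets O'Brien-Fleming-type error-spending function (one-sided)."""
    return 2.0 * (1.0 - norm.cdf(norm.ppf(1 - alpha / 2) / np.sqrt(t)))

def two_look_boundaries(t1=0.5, alpha=0.025):
    """Rejection boundaries (c1, c2) for one endpoint with looks at
    information fractions t1 and 1, spending obf_spending(t1) at look 1."""
    a1 = obf_spending(t1, alpha)
    c1 = norm.ppf(1 - a1)
    corr = np.sqrt(t1)                               # Corr(Z1, Z2) = sqrt(t1)
    bvn = multivariate_normal(mean=[0, 0], cov=[[1, corr], [corr, 1]])
    # choose c2 so that P(Z1 < c1, Z2 >= c2) equals the remaining error
    def excess(c2):
        return norm.cdf(c1) - bvn.cdf([c1, c2]) - (alpha - a1)
    return c1, brentq(excess, 0.5, 6.0)

print(two_look_boundaries())    # roughly (2.96, 1.97) for t1 = 0.5
```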

11.
Weight‐of‐evidence is the process by which multiple measurement endpoints are related to an assessment endpoint to evaluate whether significant risk of harm is posed to the environment. In this paper, a methodology is offered for reconciling or balancing multiple lines of evidence pertaining to an assessment endpoint. Weight‐of‐evidence is reflected in three characteristics of measurement endpoints: (a) the weight assigned to each measurement endpoint; (b) the magnitude of response observed in the measurement endpoint; and (c) the concurrence among outcomes of multiple measurement endpoints. First, weights are assigned to measurement endpoints based on attributes related to: (a) strength of association between assessment and measurement endpoints; (b) data quality; and (c) study design and execution. Second, the magnitude of response in the measurement endpoint is evaluated with respect to whether the measurement endpoint indicates the presence or absence of harm, as well as the magnitude of that response. Third, concurrence among measurement endpoints is evaluated by plotting the findings of the two preceding steps on a matrix for each measurement endpoint evaluated. The matrix allows easy visual examination of agreements or divergences among measurement endpoints, facilitating interpretation of the collection of measurement endpoints with respect to the assessment endpoint. A qualitative adaptation of the weight‐of‐evidence approach is also presented.

12.
In a typical clinical trial, there are one or two primary endpoints, and a few secondary endpoints. When at least one primary endpoint achieves statistical significance, there is considerable interest in using results for the secondary endpoints to enhance characterization of the treatment effect. Because multiple endpoints are involved, regulators may require that the familywise type I error rate be controlled at a pre-set level. This requirement can be achieved by using "gatekeeping" methods. However, existing methods suffer from logical oddities such as allowing results for secondary endpoint(s) to impact the likelihood of success for the primary endpoint(s). We propose a novel and easy-to-implement gatekeeping procedure that is devoid of such deficiencies. A real data example and simulation results are used to illustrate efficiency gains of our method relative to existing methods.

13.
Case‐control studies are primary study designs used in genetic association studies. Sasieni (Biometrics 1997, 53, 1253–1261) pointed out that the allelic chi‐square test used in genetic association studies is invalid when Hardy‐Weinberg equilibrium (HWE) is violated in the combined population. It is therefore important to know how far the type I error rate deviates from the nominal level when HWE is violated. We examine bounds on the type I error rate of the allelic chi‐square test. We also investigate the power of the goodness‐of‐fit test for HWE, which can be used as a guideline for choosing between the allelic chi‐square test and the modified allelic chi‐square test, the latter of which was proposed for cases of violated HWE. In small samples, the power is not large enough to detect Wright's inbreeding model with small values of the inbreeding coefficient. Therefore, when the null hypothesis of HWE is only barely accepted, the modified test should be considered as an alternative method.
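As a concrete illustration, both the allelic chi-square test and the HWE goodness-of-fit test can be computed from genotype counts. This is a generic sketch of the standard tests (not the modified allelic test), and the counts in the example are hypothetical.

```python
import numpy as np
from scipy.stats import chi2_contingency, chisquare

def allelic_chisq(case_geno, control_geno):
    """Allelic chi-square test from genotype counts (AA, Aa, aa).
    Each subject contributes two alleles; validity relies on HWE
    holding in the combined population."""
    def allele_counts(g):
        aa, ab, bb = g
        return [2 * aa + ab, ab + 2 * bb]          # counts of allele A and a
    table = np.array([allele_counts(case_geno), allele_counts(control_geno)])
    stat, p, _, _ = chi2_contingency(table, correction=False)
    return stat, p

def hwe_gof(geno):
    """Goodness-of-fit test for HWE in one sample of genotype counts."""
    aa, ab, bb = geno
    n = aa + ab + bb
    p_a = (2 * aa + ab) / (2 * n)                  # estimated allele A frequency
    expected = n * np.array([p_a**2, 2 * p_a * (1 - p_a), (1 - p_a)**2])
    return chisquare([aa, ab, bb], expected, ddof=1)   # 1 df after estimating p_a

print(allelic_chisq(case_geno=(30, 50, 20), control_geno=(45, 40, 15)))
print(hwe_gof((75, 90, 35)))
```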

14.
In clinical trials, several endpoints (EPs) are often evaluated to compare treatments in a therapeutic area. Suppose that there are two EPs in a clinical trial. We propose a new set of composite hypotheses for continuous variables that takes the relative clinical importance of the EPs into account. The main hypotheses are formulated to show that a treatment is sufficiently superior to the control treatment, which is not necessarily a placebo, on one EP that possible inferiority of the treatment by at most a certain margin on the other EP is adequately compensated from a clinical point of view. The maximum non‐inferiority margin for one EP need not be restricted to a clinically unimportant difference when it is exchanged for a large enough superiority on the other EP. This formulation leads to a new composite EP and a very simple test statistic. The intersection‐union principle is employed to derive the proposed test.

15.
Regarding the paper "Sample size determination in clinical trials with multiple co‐primary endpoints including mixed continuous and binary variables" by T. Sozu, T. Sugimoto, and T. Hamasaki, Biometrical Journal (2012) 54(5): 716–729. Article: http://dx.doi.org/10.1002/bimj.201100221; Authors' Reply: http://dx.doi.org/10.1002/bimj.201300032. This paper recently introduced a methodology for calculating the sample size in clinical trials with multiple mixed binary and continuous co‐primary endpoints modeled by the so‐called conditional grouped continuous model (CGCM). The purpose of this note is to clarify certain aspects of the methodology and propose an alternative approach based on latent means tests for the binary endpoints. We demonstrate that our approach is more powerful, yielding smaller sample sizes at powers comparable to those used in the paper.

16.
Designs incorporating more than one endpoint have become popular in drug development. One such design allows short‐term information to be incorporated in an interim analysis if the long‐term primary endpoint has not yet been observed for some of the patients. We first consider a two‐stage design with binary endpoints allowing for futility stopping only, based on conditional power under both the fixed and the observed effect. Design characteristics of three estimators are compared: one using the long‐term primary endpoint only, one using the short‐term endpoint only, and one combining data from both. For each approach, equivalent cut‐off values for the fixed‐ and observed‐effect conditional power calculations can be derived that result in the same overall power. While in trials stopping for futility the type I error rate cannot be inflated (it usually decreases), there is a loss of power. In this study, we consider different scenarios, including different thresholds for conditional power, different amounts of information available at the interim, and different correlations and probabilities of success. We further extend the methods to adaptive designs with unblinded sample size reassessment based on conditional power, using the inverse normal method as the combination function. Two different futility stopping rules are considered: one based on the conditional power, and one based on P‐values from the Z‐statistics of the estimators. Average sample size, probability of stopping for futility, and overall power of the trial are compared, and the influence of the choice of weights is investigated.
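Conditional power at an interim analysis, under either a fixed planning effect or the observed interim effect, has a simple closed form in the usual Brownian-motion approximation. The sketch below is that generic formula, not the paper's short-term/long-term combination estimator; the numbers in the example are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def conditional_power(z1, t, delta=None, alpha=0.025):
    """Conditional power given interim z-statistic z1 at information fraction t.
    `delta` is the assumed drift of the z-statistic at the final analysis;
    if None, the observed-effect estimate z1 / sqrt(t) is used instead."""
    if delta is None:
        delta = z1 / np.sqrt(t)                    # 'observed effect' version
    z_alpha = norm.ppf(1 - alpha)
    num = z_alpha - z1 * np.sqrt(t) - delta * (1 - t)
    return norm.sf(num / np.sqrt(1 - t))

# Futility sketch: stop if conditional power under the observed effect
# drops below, say, 0.20 at the interim (t = 0.5).
z1 = 0.60
print(conditional_power(z1, t=0.5))                # observed effect
print(conditional_power(z1, t=0.5, delta=2.8))     # fixed planning effect
```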

17.
The functional importance of bacteria and fungi in terrestrial systems is recognized widely. However, microbial population, community, and functional measurement endpoints change rapidly and across very short spatial scales. Measurement endpoints of microbes tend to be highly responsive to typical fluxes of temperature, moisture, oxygen, and many other noncontaminant factors. Functional redundancy across broad taxonomic groups enables wild swings in community composition without remarkable change in rates of decomposition or community respiration. Consequently, it is exceedingly difficult to relate specific microbial activities with indications of adverse and unacceptable environmental conditions. Moreover, changes in microbial processes do not necessarily result in consequences to plant and animal populations or communities, which in the end are the resources most commonly identified as those to be protected. Therefore, unless more definitive linkages are made between specific microbial effects and an adverse condition for typical assessment endpoint species, microbial endpoints will continue to have limited use in risk assessments; they will not drive the process as primary assessment endpoints.

18.
The different proteins of any proteome evolve at enormously different rates. One of the primary factors influencing rates of protein evolution is expression level, with highly expressed proteins tending to evolve at slow rates. This phenomenon, known as the expression level–evolutionary rate (E–R) anticorrelation, has been attributed to the abundance‐dependent deleterious effects of misfolding or misinteraction. We have recently shown that secreted proteins either lack an E–R anticorrelation or exhibit a significantly reduced E–R anticorrelation. This effect may be due to the strict quality control to which secreted proteins are subject in the endoplasmic reticulum (which is expected to reduce the rate of misfolding and its deleterious effects) or to their extracellular location (expected to reduce the rate of misinteraction and its deleterious effects). Among secreted proteins, N‐glycosylated ones are under particularly strong quality control. Here, we investigate how N‐linked glycosylation affects the E–R anticorrelation. Strikingly, we observe a positive E–R correlation among N‐glycosylated proteins. That is, N‐glycoproteins that are highly expressed evolve at faster rates than lowly expressed N‐glycoproteins, in contrast to what is observed among intracellular proteins.

19.
Directly standardized rates continue to be an integral tool for presenting rates for diseases that are highly dependent on age, such as cancer. Statistically, these rates are modeled as a weighted sum of Poisson random variables. This is a difficult statistical problem, because there are k observed Poisson variables and k unknown means. The gamma confidence interval has been shown through simulations to have at least nominal coverage in all simulated scenarios, but it can be overly conservative. Previous modifications to that method have closer to nominal coverage on average, but they do not achieve the nominal coverage bound in all situations. Further, those modifications are not central intervals, and the upper coverage error rate can be substantially more than half the nominal error. Here we apply a mid‐p modification to the gamma confidence interval. Typical mid‐p methods forsake guaranteed coverage to get coverage that is sometimes higher and sometimes lower than the nominal coverage rate, depending on the values of the parameters. The mid‐p gamma interval does not have guaranteed coverage in all situations; however, in the (not rare) situations where the gamma method is overly conservative, the mid‐p gamma interval often has at least nominal coverage. The mid‐p gamma interval is especially appropriate when one wants a central interval, since simulations show that in many situations both the upper and lower coverage error rates are on average less than or equal to half the nominal error rate.
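The baseline (unmodified) gamma interval of Fay and Feuer for a directly standardized rate has a short closed form, sketched below with hypothetical counts and weights; the mid-p modification discussed above changes how conservative the limits are and is not reproduced here.

```python
import numpy as np
from scipy.stats import gamma

def gamma_ci(counts, pop, weights, conf=0.95):
    """Fay-Feuer gamma confidence interval for a directly standardized rate.
    counts  : observed event counts per age stratum (Poisson)
    pop     : person-years per stratum
    weights : standard-population weights (normalized to sum to 1)."""
    counts, pop = np.asarray(counts, float), np.asarray(pop, float)
    w = np.asarray(weights, float)
    w = w / w.sum() / pop                          # per-stratum weight on each count
    y = np.sum(w * counts)                         # the standardized rate
    v = np.sum(w**2 * counts)                      # its estimated variance
    wm = w.max()
    a = (1 - conf) / 2
    lower = 0.0 if y == 0 else gamma.ppf(a, y**2 / v, scale=v / y)
    upper = gamma.ppf(1 - a, (y + wm)**2 / (v + wm**2), scale=(v + wm**2) / (y + wm))
    return y, lower, upper

# Toy example with three age strata (hypothetical numbers):
print(gamma_ci(counts=[5, 12, 30], pop=[10_000, 8_000, 5_000],
               weights=[0.4, 0.35, 0.25]))
```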

20.
The development of oncology drugs progresses through multiple phases, and after each phase a decision is made about whether to move a molecule forward. Early phase efficacy decisions are often made on the basis of single-arm studies, using a set of rules to define whether the tumor improves ("responds"), remains stable, or progresses (response evaluation criteria in solid tumors [RECIST]). These decision rules implicitly assume some form of surrogacy between tumor response and long-term endpoints like progression-free survival (PFS) or overall survival (OS). With the emergence of new therapies, for which the link between RECIST tumor response and long-term endpoints is either not yet accessible or weaker than with classical chemotherapies, tumor response-based rules may not be optimal. In this paper, we explore the use of a multistate model for decision-making based on single-arm early phase trials. The multistate model makes it possible to account for more information than the simple RECIST response status, namely the time to response, the duration of response, the PFS time, and the time to death. We propose to base the efficacy decision on the OS hazard ratio (HR) comparing a historical control to data from the experimental treatment, with the latter predicted from a multistate model fitted to early phase data with limited survival follow-up. Using two case studies, we illustrate the feasibility of estimating such an OS HR. We argue that, in the presence of limited follow-up and small sample size, and making realistic assumptions within the multistate model, the OS prediction is acceptable and may lead to better early decisions within the development of a drug.
