20 similar documents retrieved (search time: 15 ms)
1.
Joachim D. Pleil 《Biomarkers》2016,21(3):195-199
This commentary is the second of a series outlining one specific concept in interpreting biomarker data. In the first, an observational method was presented for assessing the distribution of measurements before making parametric calculations. Here, the discussion revolves around the next step: the choice between the standard error of the mean and the calculated standard deviation for comparing or predicting measurement results.
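A minimal numerical sketch of the distinction the commentary draws (illustrative values, not from the paper): the SD describes the spread of individual measurements, while the SEM quantifies the precision of the estimated mean and shrinks with n.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=50)  # hypothetical biomarker measurements

sd = x.std(ddof=1)           # sample standard deviation (spread of individuals)
sem = sd / np.sqrt(len(x))   # standard error of the mean (precision of the mean)

print(f"SD  = {sd:.3f}  -> predicts where a new measurement may fall")
print(f"SEM = {sem:.3f}  -> compares group means / reports mean precision")
```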
2.
3.
Samuel T. Hsiao Lingyun Liu Cyrus R. Mehta 《Biometrical journal. Biometrische Zeitschrift》2019,61(5):1175-1186
Clinical trials with adaptive sample size reassessment based on an unblinded analysis of interim results are perhaps the most popular class of adaptive designs (see Elsäßer et al., 2007). Such trials are typically designed by prespecifying a zone for the interim test statistic, termed the promising zone, along with a decision rule for increasing the sample size within that zone. Mehta and Pocock (2011) provided some examples of promising zone designs and discussed several procedures for controlling their type‐1 error. They did not, however, address how to choose the promising zone or the corresponding sample size reassessment rule, and proposed instead that the operating characteristics of alternative promising zone designs could be compared by simulation. Jennison and Turnbull (2015) developed an approach based on maximizing expected utility whereby one could evaluate alternative promising zone designs relative to a gold‐standard optimal design. In this paper, we show how, by eliciting a few preferences from the trial sponsor, one can construct promising zone designs that are both intuitive and achieve the Jennison and Turnbull (2015) gold‐standard for optimality.
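As a hedged illustration of the general idea (not the authors' optimal construction), a promising zone is often defined through conditional power under the current trend at the interim look; the zone boundaries and interim values below are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def conditional_power(z1, t, alpha=0.025):
    """CP under the current trend; t = n1/n is the information fraction."""
    theta_hat = z1 / np.sqrt(t)            # drift implied by the interim z
    b1 = z1 * np.sqrt(t)                   # B-value at the interim analysis
    return norm.cdf((b1 + theta_hat * (1 - t) - norm.ppf(1 - alpha))
                    / np.sqrt(1 - t))

z1, t = 1.2, 0.5                           # interim z-statistic at half information
cp = conditional_power(z1, t)
if 0.30 <= cp < 0.80:                      # hypothetical promising zone
    print(f"CP = {cp:.2f}: promising -> consider a sample size increase")
else:
    print(f"CP = {cp:.2f}: outside the promising zone")
```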
4.
Xin Wang Tu Xu Sheng Zhong Yijie Zhou Lu Cui 《Biometrical journal. Biometrische Zeitschrift》2019,61(3):769-778
In clinical trials, sample size reestimation is a useful strategy for mitigating the risk of uncertainty in design assumptions and ensuring sufficient power for the final analysis. In particular, sample size reestimation based on unblinded interim effect size can often lead to sample size increase, and statistical adjustment is usually needed for the final analysis to ensure that type I error rate is appropriately controlled. In current literature, sample size reestimation and corresponding type I error control are discussed in the context of maintaining the original randomization ratio across treatment groups, which we refer to as “proportional increase.” In practice, not all studies are designed based on an optimal randomization ratio due to practical reasons. In such cases, when sample size is to be increased, it is more efficient to allocate the additional subjects such that the randomization ratio is brought closer to an optimal ratio. In this research, we propose an adaptive randomization ratio change when sample size increase is warranted. We refer to this strategy as “nonproportional increase,” as the number of subjects increased in each treatment group is no longer proportional to the original randomization ratio. The proposed method boosts power not only through the increase of the sample size, but also via efficient allocation of the additional subjects. The control of type I error rate is shown analytically. Simulations are performed to illustrate the theoretical results.
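A sketch of why nonproportional allocation helps, under assumed variances (all numbers hypothetical, not the paper's method): for a two-sample z-test at fixed total sample size, power is maximized when the allocation ratio n2/n1 equals σ2/σ1, so added subjects are best spent moving toward that ratio.

```python
import numpy as np
from scipy.stats import norm

def power(n1, n2, delta=0.5, s1=1.0, s2=2.0, alpha=0.025):
    """One-sided power of a two-sample z-test with unequal variances."""
    se = np.sqrt(s1**2 / n1 + s2**2 / n2)
    return norm.cdf(delta / se - norm.ppf(1 - alpha))

print(f"original 1:1 design, N = 200:    {power(100, 100):.3f}")
print(f"proportional increase, N = 300:  {power(150, 150):.3f}")
print(f"nonproportional, N = 300:        {power(120, 180):.3f}  # toward n2/n1 = s2/s1")
```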
5.
We introduce sequential testing procedures for the planning and analysis of reliability studies to assess an exposure's measurement error. The designs allow repeated evaluation of the reliability of the measurements and stop testing if early evidence shows the measurement error is within the level of tolerance. Methods are developed and critical values tabulated for a number of two-stage designs. The methods are illustrated with an example evaluating the reliability of biomarkers associated with oxidative stress.
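The paper's two-stage critical values are tabulated there and not reproduced here; as a minimal sketch of the kind of quantity being monitored, a one-way intraclass correlation from replicated measurements might be computed as follows (simulated data).

```python
import numpy as np

def icc_oneway(data):
    """One-way ICC; data has shape (subjects, replicates)."""
    n, k = data.shape
    grand = data.mean()
    msb = k * ((data.mean(axis=1) - grand) ** 2).sum() / (n - 1)
    msw = ((data - data.mean(axis=1, keepdims=True)) ** 2).sum() / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

rng = np.random.default_rng(6)
truth = rng.normal(size=30)[:, None]            # 30 subjects' true levels
data = truth + 0.5 * rng.normal(size=(30, 3))   # 3 replicate measurements each
print(f"ICC = {icc_oneway(data):.2f}")          # stop early if reliability is adequate
```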
6.
Karla Moreno‐Torres Barbara Wolfe William Saville Rebecca Garabed 《Ecology and evolution》2016,6(7):2216-2225
Prevalence of disease in wildlife populations, which is necessary for developing disease models and conducting epidemiologic analyses, is often understudied. Laboratory tests used to screen for diseases in wildlife populations often are validated only for domestic animals. Consequently, the use of these tests for wildlife populations may lead to inaccurate estimates of disease prevalence. We demonstrate the use of Bayesian latent class analysis (LCA) in determining the specificity and sensitivity of a competitive enzyme‐linked immunosorbent assay (cELISA; VMRD®, Inc.) serologic test used to identify exposure to Neospora caninum (hereafter N. caninum) in three wildlife populations in southeastern Ohio, USA. True prevalence of N. caninum exposure in these populations was estimated to range from 0.1% to 3.1% in American bison (Bison bison), 51.0% to 53.8% in Père David's deer (Elaphurus davidianus), and 40.0% to 45.9% in white‐tailed deer (Odocoileus virginianus). The accuracy of the cELISA in American bison and Père David's deer was estimated to be close to the 96% sensitivity and 99% specificity reported by the manufacturer. Sensitivity in white‐tailed deer, however, ranged from 78.9% to 99.9%. Apparent prevalence of N. caninum from the test results is not equal to the true prevalence in white‐tailed deer and Père David's deer populations. Even when these species inhabit the same community, the true prevalence in the two deer populations differed from the true prevalence in the American bison population. This variation in prevalence among species suggests differences in the epidemiology of N. caninum across these colocated populations. Bayesian LCA methods could be used as in this example to overcome some of the constraints on validating tests in wildlife species. The ability to accurately evaluate disease status and prevalence in a population improves our understanding of the epidemiology of multihost pathogen systems at the community level.
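The full Bayesian LCA is beyond a short sketch, but the gap between apparent and true prevalence can be illustrated with the standard Rogan-Gladen correction, here using the manufacturer-reported sensitivity and specificity and a hypothetical apparent prevalence.

```python
def true_prevalence(apparent, se, sp):
    """Rogan-Gladen estimator: correct apparent prevalence for an imperfect test."""
    return (apparent + sp - 1) / (se + sp - 1)

# Hypothetical apparent prevalence with the manufacturer-reported accuracy:
print(f"{true_prevalence(apparent=0.45, se=0.96, sp=0.99):.3f}")  # ~0.463
```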
7.
8.
Terrestrial tardigrades are often found in the lichens and mosses growing on trees and rocks. The assertion that tardigrades in these habitats are very patchy in their distribution has rarely been backed by quantitative sampling. This study assesses spatial variability in tardigrade populations inhabiting small patches (0.1 cm² to over 5 cm²) of moss and lichen on trees and rocks at three sites in the United States of America. Tardigrades were collected from four replicate rocks in the Ouachita Mountains of Arkansas, with 30 lichen patches collected on two adjacent boulders and 20 moss patches on a second pair of boulders. In Fort Myers and in Citrus Springs, Florida, 30 lichen patches per tree were collected from two pairs of trees. The tardigrades in each sample were extracted, mounted, identified, and counted. The variation in tardigrade abundance among lichen or moss patches within rocks or trees was very high; the only consistent pattern was that very small patches usually lacked tardigrades. Tardigrade diversity and abundance also varied greatly within sites when lichens and mosses of the same species from different rocks and trees were compared (in the most extreme case one tree had numerous individuals of two tardigrade species present while the other had almost no tardigrades). The results of this quantitative sampling support the assertion that tardigrades are very patchy in distribution. Given the considerable time investment required for the quantitative processing of tardigrade samples, this high spatial variability in tardigrade diversity and abundance requires that researchers testing ecological hypotheses about tardigrade abundance check variability before deciding how many samples to take.
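A standard back-of-envelope calculation of the kind the last sentence suggests (counts hypothetical, not from the paper): pilot-sample variability translates into the number of patches needed to estimate mean abundance within a relative error E.

```python
import numpy as np

pilot = np.array([0, 3, 12, 0, 7, 25, 1, 0, 9, 4])  # tardigrades per patch (hypothetical)
cv = pilot.std(ddof=1) / pilot.mean()               # coefficient of variation
z, E = 1.96, 0.5                                    # 95% confidence, ±50% relative error
n = int(np.ceil((z * cv / E) ** 2))
print(f"CV = {cv:.2f} -> need about {n} patches")
```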
9.
MaxEnt has been one of the most popular species distribution models of recent years. Studies of endangered species, invasive species, and simulated data have shown that MaxEnt can produce reasonably accurate predictions even from small samples of occurrence data. In addition, changes in the extent of the study area also affect how a MaxEnt model is built. However, few studies have evaluated MaxEnt models against actual animal distribution data. Using the black-and-white snub-nosed monkey (Rhinopithecus bieti) as an example, we built MaxEnt models over different study extents with the distribution data of 11 monkey groups as training data (sample sizes of 1 to 10 groups), validated the models with the distribution data of five additional groups, and analyzed how sample size and study extent affect model accuracy. The results show that both the accuracy and the stability of the MaxEnt models increase with sample size and study extent, and that changes in study extent have a measurable effect on model accuracy. When applying MaxEnt to species distribution prediction, the training data should cover as much as possible of the full environmental gradient across which the species may occur, and the background points used to build the model should provide an effective contrast with the occurrence points used for modeling.
10.
M. S. Wisz R. J. Hijmans J. Li A. T. Peterson C. H. Graham A. Guisan NCEAS Predicting Species Distributions Working Group† 《Diversity & distributions》2008,14(5):763-773
A wide range of modelling algorithms is used by ecologists, conservation practitioners, and others to predict species ranges from point locality data. Unfortunately, the amount of data available is limited for many taxa and regions, making it essential to quantify the sensitivity of these algorithms to sample size. This is the first study to address this need by rigorously evaluating a broad suite of algorithms with independent presence–absence data from multiple species and regions. We evaluated predictions from 12 algorithms for 46 species (from six different regions of the world) at three sample sizes (100, 30, and 10 records). We used data from natural history collections to run the models, and evaluated the quality of model predictions with area under the receiver operating characteristic curve (AUC). With decreasing sample size, model accuracy decreased and variability increased across species and between models. Novel modelling methods that incorporate both interactions between predictor variables and complex response shapes (i.e. GBM, MARS-INT, BRUTO) performed better than most methods at large sample sizes but not at the smallest sample sizes. Other algorithms were much less sensitive to sample size, including an algorithm based on maximum entropy (MAXENT) that had among the best predictive power across all sample sizes. Relative to other algorithms, a distance metric algorithm (DOMAIN) and a genetic algorithm (OM-GARP) had intermediate performance at the largest sample size and among the best performance at the lowest sample size. No algorithm predicted consistently well with small sample size (n < 30) and this should encourage highly conservative use of predictions based on small sample size and restrict their use to exploratory modelling.
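A hedged sketch of the evaluation design only, with a single stand-in classifier rather than the study's 12 algorithms: train on 100, 30, and 10 records and score each model against independent presence–absence data with AUC (all data simulated).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 3))                       # environmental predictors
y = (X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.8, size=1000) > 0).astype(int)
X_test, y_test = X[500:], y[500:]                    # independent evaluation data

for n in (100, 30, 10):
    # stratified draw from the first 500 records so both classes appear even at n = 10
    idx = np.concatenate([rng.choice(np.where(y[:500] == c)[0], n // 2, replace=False)
                          for c in (0, 1)])
    model = LogisticRegression().fit(X[idx], y[idx])
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"n = {n:>3}: AUC = {auc:.2f}")
```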
11.
A central goal in designing clinical trials is to find the test that maximizes power (or equivalently minimizes required sample size) for finding a false null hypothesis subject to the constraint of type I error. When there is more than one test, such as in clinical trials with multiple endpoints, the issues of optimal design and optimal procedures become more complex. In this paper, we address the question of how such optimal tests should be defined and how they can be found. We review different notions of power and how they relate to study goals, and also consider the requirements of type I error control and the nature of the procedures. This leads us to an explicit optimization problem with objective and constraints that describe its specific desiderata. We present a complete solution for deriving optimal procedures for two hypotheses, which have desired monotonicity properties, and are computationally simple. For some of the optimization formulations this yields optimal procedures that are identical to existing procedures, such as Hommel's procedure or the procedure of Bittman et al. (2009), while for other cases it yields completely novel and more powerful procedures than existing ones. We demonstrate the nature of our novel procedures and their improved power extensively in a simulation and on the APEX study (Cohen et al., 2016).
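For reference, Hommel's procedure, one of the existing procedures that some of the optimal solutions coincide with, is available off the shelf; the two endpoint p-values below are hypothetical.

```python
from statsmodels.stats.multitest import multipletests

pvals = [0.012, 0.041]                    # two endpoint p-values (hypothetical)
reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method='hommel')
print(reject, p_adj)
```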
12.
The increasing interest in subpopulation analysis has led to the development of various new trial designs and analysis methods in the fields of personalized medicine and targeted therapies. In this paper, subpopulations are defined in terms of an accumulation of disjoint population subsets and will therefore be called composite populations. The proposed trial design is applicable to any set of composite populations, considering normally distributed endpoints and random baseline covariates. Treatment effects for composite populations are tested by combining p-values, calculated on the subset levels, using the inverse normal combination function to generate test statistics for those composite populations, while the closed testing procedure accounts for multiple testing. Critical boundaries for intersection hypothesis tests are derived using multivariate normal distributions, reflecting the joint distribution of composite population test statistics given no treatment effect exists. For sample size calculation and recalculation, multivariate normal distributions are derived that describe the joint distribution of composite population test statistics under an assumed alternative hypothesis. Simulations demonstrate the absence of any practically relevant inflation of the type I error rate. The target power after sample size recalculation is typically met or close to being met.
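A minimal sketch of the combination step for one composite population built from two disjoint subsets (weights and p-values hypothetical; the paper's closed testing and critical boundaries are not shown).

```python
import numpy as np
from scipy.stats import norm

p = np.array([0.03, 0.20])           # subset-level one-sided p-values
w = np.sqrt(np.array([120, 80]))     # weights, e.g. sqrt of subset sample sizes

# inverse normal combination: weighted sum of z-scores, renormalized
z = np.sum(w * norm.isf(p)) / np.sqrt(np.sum(w**2))
print(f"z = {z:.3f}, combined p = {norm.sf(z):.4f}")
```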
13.
Philip M. Westgate 《Biometrical journal. Biometrische Zeitschrift》2013,55(5):789-806
Group randomized trials (GRTs) randomize groups, or clusters, of people to intervention or control arms. To test for the effectiveness of the intervention when subject‐level outcomes are binary, and while fitting a marginal model that adjusts for cluster‐level covariates and utilizes a logistic link, we develop a pseudo‐Wald statistic to improve inference. Alternative Wald statistics could employ bias‐corrected empirical sandwich standard error estimates, which have received limited attention in the GRT literature despite their broad utility and applicability in our settings of interest. The test could also be carried out using popular approaches based upon cluster‐level summary outcomes. A simulation study covering a variety of realistic GRT settings is used to compare the accuracy of these methods in terms of producing nominal test sizes. Tests based upon the pseudo‐Wald statistic and a cluster‐level summary approach utilizing the natural log of observed cluster‐level odds worked best. Due to weighting, some popular cluster‐level summary approaches were found to lead to invalid inference in many settings. Finally, although use of bias‐corrected empirical sandwich standard error estimates did not consistently result in nominal sizes, they did work well, thus supporting the applicability of marginal models in GRT settings.
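A sketch of the cluster-level summary approach that performed well, on simulated clusters: compute the natural log of each cluster's observed odds and compare arms with an unweighted t-test (cluster counts and sizes are hypothetical).

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(2)

def cluster_log_odds(p, n_clusters=10, cluster_size=50):
    events = rng.binomial(cluster_size, p, n_clusters)  # events per cluster
    events = np.clip(events, 1, cluster_size - 1)       # guard against log(0)
    return np.log(events / (cluster_size - events))     # observed log odds

print(ttest_ind(cluster_log_odds(0.35), cluster_log_odds(0.25)))
```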
14.
Recurrent events are common in medical research for subjects who are followed for the duration of a study. For example, cardiovascular patients with an implantable cardioverter defibrillator (ICD) experience recurrent arrhythmic events that are terminated by shocks or antitachycardia pacing delivered by the device. In a published randomized clinical trial, a recurrent-event model was used to study the effect of a drug therapy in subjects with ICDs, who were experiencing recurrent symptomatic arrhythmic events. Under this model, one expects the robust variance for the estimated treatment effect to diminish when the duration of the trial is extended, due to the additional events observed. However, as shown in this article, that is not always the case. We investigate this phenomenon using large datasets from this arrhythmia trial and from a diabetes study, with some analytical results, as well as through simulations. Some insights are also provided on existing sample size formulae using our results.
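A toy simulation of the phenomenon (Poisson counts with gamma frailty, not the trial's recurrent-event model): between-subject heterogeneity puts a floor under the robust variance, so extending follow-up adds events yet eventually stops shrinking the variance.

```python
import numpy as np

rng = np.random.default_rng(8)

def robust_var_log_rate(tau, n=2000):
    frailty = rng.gamma(2.0, 0.5, size=n)     # subject-specific event rates (mean 1)
    counts = rng.poisson(frailty * tau)       # events per subject over follow-up tau
    mean_count = counts.mean()
    resid = counts - mean_count
    return resid.var() / (n * mean_count**2)  # delta-method variance of log(rate)

for tau in (1, 2, 5, 10):
    print(f"follow-up {tau:>2}: robust var(log rate) = {robust_var_log_rate(tau):.5f}")
```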
15.
P. Kumar S. K. Agarwal V. K. Mahajan 《Biometrical journal. Biometrische Zeitschrift》1983,25(3):269-274
A probability proportional to size (PPS) method of sample selection, based on transformed auxiliary information as the measure of size, is suggested. It is observed that the PPS estimator under the suggested method is always better than the simple random sampling with replacement (SRSWR) estimator and the usual PPSWR estimator. The efficiency of the proposed estimator relative to these reference estimators is also compared empirically.
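For context, a generic PPS-with-replacement selection and the Hansen-Hurwitz estimator (a standard construction; the paper's transformed size measure is not specified here, and the data are simulated).

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(10, 100, size=200)        # auxiliary size measure
y = 2.0 * x + rng.normal(0, 5, size=200)  # study variable correlated with x

p = x / x.sum()                           # selection probabilities proportional to size
idx = rng.choice(len(y), size=20, replace=True, p=p)
y_hat_total = np.mean(y[idx] / p[idx])    # Hansen-Hurwitz estimator of the total
print(f"estimated total = {y_hat_total:.0f} vs true total = {y.sum():.0f}")
```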
16.
A comparison of the performance of five modelling methods using presence/absence (generalized additive models, discriminant analysis) or presence-only (genetic algorithm for rule-set prediction, ecological niche factor analysis, Gower distance) data for modelling the distribution of the tick species Boophilus decoloratus (Koch, 1844) (Acarina: Ixodidae) at a continental scale (Africa) using climate data was conducted. This work explicitly addressed the usefulness of clustering using the normalized difference vegetation index (NDVI) to split original records and build partial models for each region (cluster) as a method of improving model performance. Models without clustering have a consistently lower performance (as measured by sensitivity and area under the curve [AUC]), although presence/absence models perform better than presence-only models. Two cluster-related variables, namely, prevalence (commonness of tick records in the cluster) and marginality (the relative position of the climate niche occupied by the tick in relation to that available in the cluster) greatly affect the performance of each model (P < 0.05). Both sensitivity and AUC are better for NDVI-derived clusters where the tick is more prevalent or its marginality is low. However, the total size of the cluster or its fragmentation (measured by Shannon's evenness index) did not affect the performance of models. Models derived separately for each cluster produced the best output but resulted in a patchy distribution of predicted occurrence. The use of such a method together with weighting procedures based on prevalence and marginality as derived from populations at each cluster produced a slightly lower predictive performance but a better estimation of the continental distribution of the tick. Therefore, cluster-derived models are able to effectively capture restricting conditions for different tick populations at a regional level. It is concluded that data partitioning is a powerful method with which to describe the climate niche of populations of a tick species, as adapted to local conditions. The use of this methodology greatly improves the performance of climate suitability models.
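A schematic of the data-partitioning idea with stand-in data and a stand-in learner (the study used NDVI clusters with five different modelling methods): cluster records by their NDVI profile, then fit one partial model per cluster.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)
ndvi = rng.random((300, 12))              # monthly NDVI profile per record
climate = rng.normal(size=(300, 4))       # climate predictors
present = rng.integers(0, 2, size=300)    # presence/absence labels

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(ndvi)
models = {c: LogisticRegression().fit(climate[km.labels_ == c],
                                      present[km.labels_ == c])
          for c in range(3)}              # one partial model per NDVI region

c0 = km.predict(ndvi[:1])[0]              # score a record with its own cluster's model
print(models[c0].predict_proba(climate[:1])[0, 1])
```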
17.
Rosamarie Frieri William Fisher Rosenberger Nancy Flournoy Zhantao Lin 《Biometrics》2023,79(3):2565-2576
When there is a predictive biomarker, enrichment can focus the clinical trial on a benefiting subpopulation. We describe a two-stage enrichment design, in which the first stage is designed to efficiently estimate a threshold and the second stage is a “phase III-like” trial on the enriched population. The goal of this paper is to explore design issues: sample size in Stages 1 and 2, and re-estimation of the Stage 2 sample size following Stage 1. By treating these as separate trials, we can gain insight into how the predictive nature of the biomarker specifically impacts the sample size. We also show that failure to adequately estimate the threshold can have disastrous consequences in the second stage. While any bivariate model could be used, we assume a continuous outcome and continuous biomarker, described by a bivariate normal model. The correlation coefficient between the outcome and biomarker is the key to understanding the behavior of the design, both for predictive and prognostic biomarkers. Through a series of simulations we illustrate the impact of model misspecification, consequences of poor threshold estimation, and requisite sample sizes that depend on the predictive nature of the biomarker. Such insight should be helpful in understanding and designing enrichment trials.
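A sketch under the paper's bivariate normal assumption, with all numbers hypothetical: estimate a Stage 1 enrichment threshold as the biomarker value at which the regression of outcome on biomarker crosses a benefit cutoff of zero.

```python
import numpy as np

rng = np.random.default_rng(4)
rho, n1 = 0.6, 200                        # biomarker-outcome correlation, Stage 1 n
cov = [[1, rho], [rho, 1]]
biomarker, outcome = rng.multivariate_normal([0, 0.3], cov, n1).T

slope, intercept = np.polyfit(biomarker, outcome, 1)
threshold = (0.0 - intercept) / slope     # biomarker value where predicted outcome = 0
print(f"estimated enrichment threshold = {threshold:.2f}")
```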
18.
Research examining the effects of electromagnetic fields (EMFs) on human performance and physiology has produced inconsistent results; this might be attributable to low statistical power. Statistical power refers to the probability of obtaining a statistically significant result, given the fact that a real effect exists. The results of a survey of published investigations of the effects of EMFs on human performance and physiology show that statistical power levels are very low, ranging from a mean of .08 for small effect sizes to .46 for large effect sizes. Implications of these findings for the interpretation of results are discussed along with suggestions for increasing statistical power. © 1996 Wiley-Liss, Inc.
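The survey's point can be reproduced in a few lines: at common sample sizes, the power of a two-sample t-test for a small effect is far below the conventional .80 target (Cohen's effect-size conventions assumed; n per arm is hypothetical).

```python
from statsmodels.stats.power import TTestIndPower

for d, label in [(0.2, "small"), (0.5, "medium"), (0.8, "large")]:
    pw = TTestIndPower().power(effect_size=d, nobs1=20, alpha=0.05)
    print(f"{label} effect (d = {d}): power = {pw:.2f}")
```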
19.
20.
Models of species’ distributions and niches are frequently used to infer the importance of range- and niche-defining variables. However, the degree to which these models can reliably identify important variables and quantify their influence remains unknown. Here we use a series of simulations to explore how well models can 1) discriminate between variables with different influence and 2) calibrate the magnitude of influence relative to an ‘omniscient’ model. To quantify variable importance, we trained generalized additive models (GAMs), Maxent and boosted regression trees (BRTs) on simulated data and tested their sensitivity to permutations in each predictor. Importance was inferred by calculating the correlation between permuted and unpermuted predictions, and by comparing predictive accuracy of permuted and unpermuted predictions using AUC and the continuous Boyce index. In scenarios with one influential and one uninfluential variable, models failed to discriminate reliably between variables when training occurrences were < 8–64, prevalence was > 0.5, spatial extent was small, environmental data had coarse resolution and spatial autocorrelation was low, or when pairwise correlation between environmental variables was |r| > 0.7. When two variables influenced the distribution equally, importance was underestimated when species had narrow or intermediate niche breadth. Interactions between variables in how they shaped the niche did not affect inferences about their importance. When variables acted unequally, the effect of the stronger variable was overestimated. GAMs and Maxent discriminated between variables more reliably than BRTs, but no algorithm was consistently well-calibrated vis-à-vis the omniscient model. Algorithm-specific measures of importance like Maxent's change-in-gain metric were less robust than the permutation test. Overall, high predictive accuracy did not connote robust inferential capacity. As a result, requirements for reliably measuring variable importance are likely more stringent than for creating models with high predictive accuracy.
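A generic version of the permutation test described, using a stand-in logistic model rather than GAMs, Maxent, or BRTs: permute one predictor at a time and compare permuted with unpermuted predictions by correlation and AUC drop (all data simulated).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(5)
X = rng.normal(size=(500, 2))                               # two environmental variables
y = (X[:, 0] + 0.1 * rng.normal(size=500) > 0).astype(int)  # variable 0 drives presence

model = LogisticRegression().fit(X, y)
base_pred = model.predict_proba(X)[:, 1]

for j in range(2):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])            # permute predictor j
    perm_pred = model.predict_proba(Xp)[:, 1]
    corr = np.corrcoef(base_pred, perm_pred)[0, 1]  # low correlation -> important
    auc_drop = roc_auc_score(y, base_pred) - roc_auc_score(y, perm_pred)
    print(f"variable {j}: corr = {corr:.2f}, AUC drop = {auc_drop:.2f}")
```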