Similar documents
 20 similar documents found (search time: 31 ms)
1.
Investigations of sample size for planning case-control studies have usually been limited to detecting a single factor. In this paper, we investigate sample size for multiple risk factors in strata-matched case-control studies. We construct an omnibus statistic for testing M different risk factors based on the jointly sufficient statistics of the parameters associated with the risk factors. The statistic is non-iterative, and it reduces to the Cochran statistic when M = 1. The asymptotic power function of the test is a non-central chi-square with M degrees of freedom, and the sample size required for a specified power can be obtained from the inverse relationship. We find that equal sample allocation is optimal. A Monte Carlo experiment demonstrates that an approximate formula for calculating sample size is satisfactory in typical epidemiologic studies. An approximate sample size obtained using Bonferroni's method for multiple comparisons is much larger than that obtained using the omnibus test. The approximate sample size formulas investigated in this paper, for the omnibus test as well as for the individual tests, can be useful in designing case-control studies for detecting multiple risk factors.
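The inversion described in this abstract can be sketched numerically: fix a per-subject noncentrality, compute power from the noncentral chi-square with M degrees of freedom, and search for the smallest n that reaches the target. The per-subject noncentrality value used below is a hypothetical design input, not one taken from the paper.

```python
# Sketch: invert the noncentral chi-square power function to obtain a
# sample size.  `lam_per_subject` (noncentrality contributed by one
# subject) is a hypothetical design input, not a value from the paper.
from scipy.stats import chi2, ncx2

def omnibus_sample_size(M, lam_per_subject, alpha=0.05, power=0.80):
    """Smallest n whose asymptotic power -- a noncentral chi-square with
    M degrees of freedom and noncentrality n * lam_per_subject -- reaches
    the target power."""
    crit = chi2.ppf(1 - alpha, df=M)  # rejection threshold under H0
    n = 1
    while ncx2.sf(crit, df=M, nc=n * lam_per_subject) < power:
        n += 1
    return n

# More risk factors (larger M) require a larger sample at the same
# per-subject effect:
sizes = {M: omnibus_sample_size(M, 0.02) for M in (1, 2, 3)}
```

For M = 1 this reduces to powering a single Cochran-type test, matching the reduction noted in the abstract.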

2.
We are concerned with calculating the sample size required for estimating the mean of the continuous distribution in the context of a two-component nonstandard mixture distribution (i.e., a mixture of an identifiable point degenerate function F at a constant with probability P and a continuous distribution G with probability 1 − P). A common ad hoc procedure of escalating the naïve sample size n (calculated under the assumption of no point degenerate function F) by a factor of 1/(1 − P) has only about a 0.5 probability of achieving the pre-specified statistical power. Such an ad hoc approach may seriously underestimate the necessary sample size and jeopardize inferences in scientific investigations. We argue that sample size calculations in this context should carry a pre-specified probability, set by the researcher at a level greater than 0.5, of achieving power ≥ 1 − β. To that end, we propose an exact method and an approximate method for calculating sample size in this context so that the probability of achieving the desired statistical power is determined by the researcher. (© 2004 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)
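The contrast between the ad hoc escalation and an "exact" calculation can be illustrated with a binomial argument: each observation is continuous with probability 1 − P, so the number of informative observations among n draws is Binomial(n, 1 − P). The function below is a sketch of that idea, not the paper's published procedure.

```python
# Sketch of the 'exact' idea: escalate n until at least n0 informative
# (continuous-component) observations are obtained with high assurance.
from math import ceil
from scipy.stats import binom

def escalated_n(n0, p_degenerate, assurance=0.90):
    """Smallest total n such that at least n0 observations fall in the
    continuous component G with probability >= `assurance`."""
    n = n0
    while binom.sf(n0 - 1, n, 1 - p_degenerate) < assurance:
        n += 1
    return n

naive = ceil(50 / (1 - 0.2))        # ad hoc escalation of n0 = 50
exact = escalated_n(50, 0.2, 0.90)  # demands more observations
```

With `assurance=0.5` the result lands near the ad hoc n0/(1 − P), which is exactly why the ad hoc rule achieves the planned power only about half the time.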

3.
Noether (1987) proposed a method of sample size determination for the Wilcoxon-Mann-Whitney test. To obtain a sample size formula, he restricted himself to alternatives that differ only slightly from the null hypothesis, so that the unknown variance σ² of the Mann-Whitney statistic can be approximated by the known variance under the null hypothesis, which depends only on n. This restriction is frequently forgotten in statistical practice. In this paper, we compare Noether's large-sample solution against an alternative approach based on upper bounds for σ² that is valid for arbitrary alternatives. The comparison shows that Noether's approximation is sufficiently reliable for both small and large deviations from the null hypothesis.
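Noether's large-sample formula itself takes only a few lines: with p′ = P(Y > X) under the alternative and c the fraction of subjects allocated to the first group, N = (z_α + z_β)² / (12 c (1 − c) (p′ − 1/2)²). The numbers below are illustrative.

```python
# Noether's (1987) sample size for the Wilcoxon-Mann-Whitney test,
# using the null variance of the Mann-Whitney statistic.
from math import ceil
from statistics import NormalDist

def noether_n(p_prime, alpha=0.05, power=0.80, c=0.5):
    """Total sample size N for a two-sided WMW test; p_prime is
    P(Y > X) under the alternative, c the allocation fraction."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return ceil(z**2 / (12 * c * (1 - c) * (p_prime - 0.5)**2))
```

For example, detecting p′ = 0.6 with 80% power at a two-sided 5% level requires 262 subjects in total under equal allocation; unbalanced allocation (c ≠ 0.5) always inflates N.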

4.
Clinical trials with adaptive sample size re-assessment based on an analysis of the unblinded interim results (ubSSR) have gained in popularity due to uncertainty regarding the value of δ at which to power the trial at the start of the study. While the statistical methodology for controlling the type-1 error of such designs is well established, there remain concerns that conventional group sequential designs with no ubSSR can accomplish the same goals with greater efficiency. It has been difficult, however, to make this efficiency comparison precise. In this paper, we present a methodology for making the comparison in a standard, well-accepted manner by plotting the unconditional power curves of the two approaches while holding their expected sample sizes constant at each value of δ in the range of interest. It is seen that under reasonable decision rules for increasing sample size (conservative promising zones, and no more than a 50% increase in sample size) there is little or no loss of efficiency for the adaptive designs in terms of unconditional power. The two approaches, however, have very different conditional power profiles. More generally, the methodology allows any design with ubSSR to be compared with a comparable group sequential design without ubSSR, so one can determine whether the efficiency loss, if any, of the ubSSR design is offset by the advantages it confers for re-powering the study at the time of the interim analysis.

5.
As the nonparametric generalization of the one-way analysis of variance model, the Kruskal–Wallis test applies when the goal is to test the difference between multiple samples and the underlying population distributions are nonnormal or unknown. Although the Kruskal–Wallis test has been widely used for data analysis, power and sample size methods for this test have been investigated to a much lesser extent. This article proposes new power and sample size calculation methods for the Kruskal–Wallis test based on a pilot study, in either a completely nonparametric model or a semiparametric location model. No assumption is made on the shape of the underlying population distributions. Simulation results show that, in terms of sample size calculation for the Kruskal–Wallis test, the proposed methods are more reliable and preferable to some more traditional methods. A mouse peritoneal cavity study is used to demonstrate the application of the methods.

6.
R. J. Connor, Biometrics, 1987, 43(1): 207-211
Miettinen (1968, Biometrics 24, 339-352) presented an approximation for power and sample size for testing the differences between proportions in the matched-pair case. Duffy (1984, Biometrics 40, 1005-1015) gave the exact power for this case and showed that Miettinen's approximation tends to slightly overestimate the power or underestimate the sample size necessary for the design power. A simple alternative approximation that is more conservative is presented here. In many cases, the sample size for the independent-sample case provides a conservative approximation for the matched-pair design.
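For orientation, a common normal approximation for the matched-pair setting conditions on the discordant pairs. The version below is a standard textbook approximation of this family, not necessarily the exact expression Connor proposes; p10 and p01 are the two discordant-pair probabilities.

```python
# Standard normal-approximation sample size for a matched-pair test of
# marginal homogeneity (McNemar-type), conditioning on discordant pairs.
# This is a generic textbook formula, not necessarily Connor's exact one.
from math import ceil, sqrt
from statistics import NormalDist

def matched_pairs_n(p10, p01, alpha=0.05, power=0.80):
    """Number of pairs for a two-sided test; p10, p01 are the
    discordant-pair probabilities under the alternative."""
    d = p10 - p01                  # difference of marginal proportions
    pd = p10 + p01                 # total discordant probability
    za = NormalDist().inv_cdf(1 - alpha / 2)
    zb = NormalDist().inv_cdf(power)
    return ceil((za * sqrt(pd) + zb * sqrt(pd - d**2))**2 / d**2)
```

As the abstract notes, the independent-sample sample size is often a conservative (larger) stand-in for this matched-pair answer.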

7.
Cochran (1953) and Bartch (1957) gave formulae for the magnitude of the sample size n ensuring the validity of the limiting normal distribution of the sample mean x̄ obtained from a non-normal distribution with marked asymmetry and kurtosis. These formulae are checked empirically in this paper using (a) simulated data with given asymmetry and kurtosis and (b) real data gathered from a coronary heart disease study. We find that our results are in general agreement with Bartch's formula. However, in a number of cases the asymptotic normal distribution is attained for a smaller sample size than that required by Bartch's formula.

8.
This paper outlines methods of determining sample size for epidemiologic research in studies of the etiologic fraction. The basic model with a dichotomous disease and a single dichotomous exposure factor is considered. To determine sample size, the researcher must specify: the magnitude of the etiologic fraction ε to be detected as statistically significant, the level of significance α, the power 1 − β of the test, p, the proportion of the population exposed to the risk factor, and R, the proportion of the population with the disease. Sample size formulas and tables are presented for the case-control, cohort, and cross-sectional designs. Optimal allocation considerations are examined to minimize cost for a specified power. Extensive use is made of Walter's results concerning the asymptotic variance of the maximum likelihood estimator of the etiologic fraction for the three epidemiologic study designs.

9.
Biomarker-directed targeted clinical trials aim to develop pharmaceutical agents for a targeted patient subpopulation sharing a specific disease etiology. Biomarkers play a key role in patient enrichment for targeted trials. Biomarker performance substantially impacts the heterogeneity of a targeted study population and consequently trial efficiency, statistical power, information accumulation, and early stopping decision-making (Simon and Maitournam in Clinical Cancer Res 10:6759-6763, 2004; Maitournam and Simon in Stat Med 24:329-339, 2005; Gao et al. in Contemp Clin Trials 42:119-131, 2015). Hence, accurate assessment of biomarker performance is crucial to sample size calculation in the planning of targeted trials. However, prior knowledge of biomarker performance is often limited at the planning stage due to inadequate biomarker validation, differences between study populations in demographic characteristics and trial settings, etc. Under this circumstance, an adaptive design is useful for updating biomarker performance and re-estimating sample size while a targeted trial is ongoing. In this paper, we propose a two-stage adaptive design that provides flexibility in biomarker performance-based sample size adaptation for targeted trials. The design can help a targeted trial achieve its planned statistical power by re-assessment of actual biomarker performance and subsequent sample size adaptation while preserving the desired type-1 error.

10.
Statistical criteria for evaluation of individual bioequivalence (IBE) between generic and innovative products often involve a function of the second moments of normal distributions. Under replicated crossover designs, the aggregate criterion for IBE proposed in the guidance of the U.S. Food and Drug Administration (FDA) contains the squared mean difference, the variance of the subject-by-formulation interaction, and the difference in within-subject variances between the generic and innovative products. The upper confidence bound for the linearized form of the criterion, derived by the modified large sample (MLS) method, is proposed in the 2001 U.S. FDA guidance as a testing procedure for evaluation of IBE. Due to the complexity of the power function for the criterion based on the second moments, literature on sample size determination for the inference of IBE is scarce. Under the two-sequence, four-period crossover design, we derive the asymptotic distribution of the upper confidence bound of the linearized criterion. Hence the asymptotic power can be derived for sample size determination for evaluation of IBE. Results of numerical studies are reported. Sample size determination for evaluation of IBE based on the aggregate second-moment criterion in practical applications is also discussed.

11.
We consider sample size determination for ordered categorical data when the alternative hypothesis is a proportional odds model. In this paper the sample size formula proposed by Whitehead (Statistics in Medicine, 12, 2257–2271, 1993) is compared with methods based on exact and asymptotic linear rank tests with Wilcoxon and trend scores. We show that Whitehead's formula, which is based on a normal approximation, works well when the sample size is moderate to large, but recommend the exact method with Wilcoxon scores for small sample sizes. The consequences of model misspecification are also investigated.
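Because exact and asymptotic answers can disagree at small sample sizes, a Monte Carlo check is a useful companion to any formula here. The sketch below simulates ordered categorical data under a proportional odds shift θ (the control-arm category probabilities are hypothetical) and estimates the power of the Wilcoxon rank test.

```python
# Monte Carlo power of the Wilcoxon test for ordinal data generated
# under a proportional odds alternative; category probabilities are
# hypothetical illustrations, not values from the paper.
import numpy as np
from scipy.stats import mannwhitneyu

def po_shift(p_control, theta):
    """Treatment-arm category probabilities under proportional odds:
    logit P(Y <= k) is shifted by -theta for every cut-point k."""
    cum = np.cumsum(p_control)[:-1]
    odds = cum / (1 - cum) * np.exp(-theta)
    cum_t = np.append(odds / (1 + odds), 1.0)
    return np.diff(cum_t, prepend=0.0)

def wilcoxon_power(n_per_arm, p_control, theta,
                   alpha=0.05, reps=1000, seed=1):
    """Estimated rejection rate of a two-sided Wilcoxon test."""
    rng = np.random.default_rng(seed)
    cats = np.arange(len(p_control))
    p_treat = po_shift(np.asarray(p_control, float), theta)
    hits = 0
    for _ in range(reps):
        x = rng.choice(cats, n_per_arm, p=p_control)
        y = rng.choice(cats, n_per_arm, p=p_treat)
        if mannwhitneyu(x, y, alternative="two-sided").pvalue < alpha:
            hits += 1
    return hits / reps
```

Setting θ = 0 recovers the null (the simulated rejection rate should sit near α), which is a quick sanity check on both the data generator and the test.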

12.
We discuss here the influence of sample size (number of replicates) on the accuracy and precision of the results when sampling profundal benthos with an Ekman grab according to the Finnish standard, SFS 5076, which is equivalent to the Swedish and Norwegian standards. The aim was to find criteria for choosing a sample size which would avoid any powerful influence of chance on the results without entailing an unreasonable amount of work for monitoring purposes. Lake Haukivesi (area 620 km², total phosphorus 13 µg l⁻¹ and colour 35 Pt mg l⁻¹), Lake Paasivesi (116 km², 5 µg l⁻¹ and 35 Pt mg l⁻¹) and Lake Puruvesi (322 km², 4 µg l⁻¹ and 5 Pt mg l⁻¹) were sampled randomly in June and October 1991. Twenty-five replicate samples were taken on each occasion from the deep profundal area of each lake, defined here as 60–100% of the maximum depth. The sedimentation areas studied were fairly homogeneous, since the animal communities were not markedly affected by the variations in depth. Distribution estimates for the statistics studied, such as number of individuals, expected number of species, diversity and benthic quality indices, were calculated for a large set of random samples taken from the empirical data by computer (bootstrap sampling). The sample variance s² correlated with the mean animal density m (ind. m⁻²) according to the equation s² = 31.77 m^1.247. The sample size required to achieve the desired precision in mean animal density (D, expressed as the ratio standard error/mean) can thus be estimated as n = 31.77 m^(−0.753) D^(−2). The number of replicate samples needed to achieve a standard error of 20% of the mean density was 10 in Lake Haukivesi, seven in Lake Paasivesi and 11 in Lake Puruvesi. The accuracy and precision of the estimated number of species, Shannon's diversity and the Benthic Quality Index improved markedly as the sample size was increased to 10 replicates.
As a compromise between work load and statistical reliability, a figure of 10 replicate Ekman samples is proposed here for the monitoring of profundal benthos. The proposed sample size usually produces individual counts that are high enough for practical purposes, probably at least 100 individuals, which is recommended as a minimum in the standard. The lower number of replicate samples recommended in a recent Finnish handbook, 3–5, usually produces inadequate data; this may detract from the comparability of the results and leave changes in profundal communities undetected.
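The replicate-number rule quoted in this abstract turns directly into a small helper: the fitted variance-mean regression s² = 31.77 m^1.247 implies n = 31.77 m^(−0.753) D^(−2) replicates for a target relative standard error D.

```python
# Replicate count from the study's fitted variance-mean regression
# s^2 = 31.77 * m**1.247 (m in individuals per square metre).
from math import ceil

def ekman_replicates(mean_density, rel_se=0.20):
    """Number of Ekman grab replicates so that the standard error of
    the mean density is rel_se * mean, i.e.
    n = 31.77 * m**-0.753 * rel_se**-2."""
    return ceil(31.77 * mean_density ** -0.753 * rel_se ** -2)
```

Densities of a few hundred individuals per square metre land in the 7–11 replicate range the study reports for its three lakes at D = 0.20, and halving D quadruples the required replication.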

13.
The field of ecological restoration is growing rapidly, and the sourcing of suitable seed is a major issue. Information on the population genetic structure of a species can provide valuable information to aid in defining seed collection zones. For a practical contribution from genetics, a rapid approach to delineating seed collection zones using genetic markers (amplified fragment length polymorphisms [AFLPs]) has been developed. Here, we test the effects of sampling regime on the efficacy of this method. Genetic data were collected for an outcrossing seeder, Daviesia divaricata ssp. divaricata, an important species in urban bushland restoration in Perth, Western Australia. The effect of sample size and number of AFLP markers on estimates of genetic variation and population structure was examined in relation to implications for sourcing material for restoration. Three different sample sizes were used (n = 8, 15, and 30) from six urban bushland remnants. High levels of genetic diversity were observed in D. divaricata (87.4% polymorphic markers), with significant population differentiation detected among the sampled populations (ΘB = 0.1386, p < 0.001). Although sample size does not appear to affect the spatial pattern in principal co-ordinates analysis (PCA) plots, the number of polymorphic loci increased with sample size, and estimates of population subdivision (FST and ΘB) and the associated confidence intervals decreased with increasing sample size. We recommend using a minimum of 30 plants when sourcing seed for restoration projects.

14.
MOTIVATION: There is no widely applicable method to determine the sample size for experiments that base statistical significance on the false discovery rate (FDR). RESULTS: We propose and develop the anticipated FDR (aFDR) as a conceptual tool for determining sample size. We derive mathematical expressions for the aFDR and the anticipated average statistical power. These expressions are used to develop a general algorithm to determine sample size. We provide specific details on how to implement the algorithm for k-group (k ≥ 2) comparisons. The algorithm performs well for k-group comparisons in a series of traditional simulations and in a real-data simulation conducted by resampling from a large, publicly available dataset. AVAILABILITY: Documented S-plus and R code libraries are freely available from www.stjuderesearch.org/depts/biostats.

15.

Background  

Before conducting a microarray experiment, one important issue that needs to be determined is the number of arrays required in order to have adequate power to identify differentially expressed genes. This paper discusses some crucial issues in the problem formulation, parameter specifications, and approaches commonly proposed for sample size estimation in microarray experiments. Common methods formulate the sample size as the minimum number of arrays necessary to achieve a specified sensitivity (proportion of truly differentially expressed genes detected) on average, at a specified false discovery rate (FDR) level and a specified expected proportion (π₁) of truly differentially expressed genes on the array. Unfortunately, the probability of attaining the specified sensitivity under such a formulation can be low. We instead formulate the sample size problem as the number of arrays needed to achieve a specified sensitivity with 95% probability at the specified significance level. A permutation method using a small pilot dataset to estimate sample size is proposed. This method accounts for correlation and effect-size heterogeneity among genes.

16.
Numerous initiatives are underway throughout New England and elsewhere to quantify salt marsh vegetation change, mostly in response to habitat restoration, sea level rise, and nutrient enrichment. To detect temporal changes in vegetation at a marsh, or to compare vegetation among different marshes with a degree of statistical certainty, an adequate sample size is required. Based on sampling 1 m² vegetation plots from 11 New England salt marsh data sets, we conducted a power analysis to determine the minimum number of samples necessary to detect change between vegetation communities. Statistical power was determined for sample sizes of 5, 10, 15, and 20 vegetation plots at an alpha level of 0.05. Detection of subtle differences between vegetation data sets (e.g., comparing vegetation in the same marsh over two consecutive years) can be accomplished using a sample size of 20 plots, with a reasonable probability of detecting a difference when one truly exists. With a lower sample size, and thus lower power, there is an increased probability of failing to detect a difference when one exists (i.e., a Type II error). However, if investigators expect to detect major changes in vegetation (such as those between an un-impacted and a highly impacted marsh), then a sample size of 5, 10, or 15 plots may be appropriate while still maintaining adequate power. Given the relative ease of collecting vegetation data, we suggest a minimum sample size of 20 randomly located 1 m² plots when developing monitoring designs to detect vegetation community change in salt marshes. The sample size of 20 plots per New England salt marsh is appropriate regardless of marsh size or the permanency (permanent or non-permanent) of the plots.

17.

Background  

Microarrays permit biologists to simultaneously measure the mRNA abundance of thousands of genes. An important issue facing investigators planning microarray experiments is how to estimate the sample size required for good statistical power. What is the projected sample size or number of replicate chips needed to address the multiple hypotheses with acceptable accuracy? Statistical methods exist for calculating power based upon a single hypothesis, using estimates of the variability in data from pilot studies. There is, however, a need for methods to estimate power and/or required sample sizes in situations where multiple hypotheses are being tested, such as in microarray experiments. In addition, investigators frequently do not have pilot data to estimate the sample sizes required for microarray studies.

18.
In this paper (1) expressions (correct to terms of order n⁻²) for the biases, variances, and covariances of the estimators a and b of the Hermite distribution with probability generating function exp[a(t − 1) + b(t² − 1)] are obtained for two mixed moment estimators; (2) for the biases and variance-covariances, approximate regions of the parameter space (a > 0, b > 0) are outlined where a sample of size 100 can be considered "safe", in the sense that the contribution of the second-order terms is 5% of that from the first-order term; (3) the biases and variance-covariances of these two sets of estimators are compared with those of the moment estimators, the maximum likelihood estimators, and the even-point estimators for a sample of size 100, using terms up to order n⁻²; (4) the comparisons in (3), based on n⁻² terms, not only provide information on the estimation procedures for the Hermite distribution but also demonstrate the importance of higher-order terms in the sampling properties of the various alternative techniques.

19.
Mass spectrometric profiling approaches such as MALDI‐TOF and SELDI‐TOF are increasingly being used in disease marker discovery, particularly in the lower molecular weight proteome. However, little consideration has been given to the issue of sample size in experimental design. The aim of this study was to develop a protocol for the use of sample size calculations in proteomic profiling studies using MS. These sample size calculations can be based on a simple linear mixed model which allows the inclusion of estimates of biological and technical variation inherent in the experiment. The use of a pilot experiment to estimate these components of variance is investigated and is shown to work well when compared with larger studies. Examination of data from a number of studies using different sample types and different chromatographic surfaces shows the need for sample‐ and preparation‐specific sample size calculations.
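The linear mixed model mentioned in this abstract implies a simple two-group rule once the pilot experiment has yielded biological and technical variance components: averaging r technical replicate spectra per sample leaves a per-sample variance of σ_b² + σ_t²/r. The function below is a generic sketch under that model, not the paper's exact protocol.

```python
# Generic mixed-model sample size sketch: per-group number of
# biological samples for a two-sided, two-group comparison of one peak
# intensity, with technical replicates averaged within each sample.
from math import ceil
from statistics import NormalDist

def profiling_n(delta, sigma_bio, sigma_tech, tech_reps=1,
                alpha=0.05, power=0.80):
    """delta is the mean intensity difference to detect; sigma_bio and
    sigma_tech are the biological and technical standard deviations
    (e.g., from a pilot experiment); each biological sample is run
    tech_reps times and the replicate spectra are averaged."""
    var = sigma_bio**2 + sigma_tech**2 / tech_reps
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return ceil(2 * z**2 * var / delta**2)
```

Averaging technical replicates only shrinks the σ_t²/r term, so extra instrument runs cannot substitute for more biological samples once σ_b² dominates — one reason the sample size calculations end up sample- and preparation-specific.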

20.
MOTIVATION: Microarray experiments often involve hundreds or thousands of genes. In a typical experiment, only a fraction of genes are expected to be differentially expressed; in addition, the measured intensities among different genes may be correlated. Depending on the experimental objectives, sample size calculations can be based on one of three specified measures: sensitivity, true discovery rate, and accuracy rate. The sample size problem is formulated as the number of arrays needed to achieve the desired fraction of the specified measure at the desired family-wise power, at the given type I error and (standardized) effect size. RESULTS: We present a general approach for estimating sample size under independent and equally correlated models using binomial and beta-binomial models, respectively. The sample sizes needed for a two-sample z-test are computed; the computed theoretical numbers agree well with Monte Carlo simulation results. Under more general correlation structures, however, the beta-binomial model can underestimate the needed sample size by about 1–5 arrays. CONTACT: jchen@nctr.fda.gov.

