Similar Documents
 20 similar documents retrieved (search time: 31 ms)
1.

Background  

Microarrays permit biologists to simultaneously measure the mRNA abundance of thousands of genes. An important issue facing investigators planning microarray experiments is how to estimate the sample size required for good statistical power. What is the projected sample size or number of replicate chips needed to address the multiple hypotheses with acceptable accuracy? Statistical methods exist for calculating power based upon a single hypothesis, using estimates of the variability in data from pilot studies. There is, however, a need for methods to estimate power and/or required sample sizes in situations where multiple hypotheses are being tested, such as in microarray experiments. In addition, investigators frequently do not have pilot data to estimate the sample sizes required for microarray studies.
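The single-hypothesis power calculation mentioned above extends to the multiple-testing setting by shrinking the per-test significance level. A minimal sketch (a Bonferroni-style adjustment with a normal approximation to the two-sample test; the effect size, variance, and number of genes are illustrative assumptions, not values from the study):

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_ppf(p):
    """Inverse normal CDF by bisection (adequate for this sketch)."""
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if norm_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def two_sample_power(delta, sigma, n_per_group, alpha):
    """Approximate power of a two-sided two-sample z-test for a mean
    difference delta with common standard deviation sigma."""
    z_crit = norm_ppf(1.0 - alpha / 2.0)
    ncp = abs(delta) / sigma * math.sqrt(n_per_group / 2.0)
    return 1.0 - norm_cdf(z_crit - ncp) + norm_cdf(-z_crit - ncp)

def required_n(delta, sigma, alpha, target_power):
    """Smallest per-group number of replicate chips reaching target power."""
    n = 2
    while two_sample_power(delta, sigma, n, alpha) < target_power:
        n += 1
    return n

# Bonferroni: testing m = 10,000 genes at family-wise alpha = 0.05.
m = 10_000
alpha_adj = 0.05 / m
# Assumed log-scale difference (delta) and a pilot-style sigma estimate.
n = required_n(delta=1.0, sigma=0.5, alpha=alpha_adj, target_power=0.8)
```

With 10,000 genes the per-test alpha drops to 5e-6, and the required number of replicate chips per group grows accordingly; in practice a pilot estimate of sigma would replace the assumed value.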

2.
Reducing the number of animal subjects used in biomedical experiments is desirable for ethical and practical reasons. Previous reviews of the benefits of reducing sample sizes have focused on improving experimental designs and methods of statistical analysis, but reducing the size of control groups has been considered rarely. We discuss how the number of current control animals can be reduced, without loss of statistical power, by incorporating information from historical controls, i.e. subjects used as controls in similar previous experiments. Using example data from published reports, we describe how to incorporate information from historical controls under a range of assumptions that might be made in biomedical experiments. Assuming more similarities between historical and current controls yields higher savings and allows the use of smaller current control groups. We conducted simulations, based on typical designs and sample sizes, to quantify how different assumptions about historical controls affect the power of statistical tests. We show that, under our simulation conditions, the number of current control subjects can be reduced by more than half by including historical controls in the analyses. In other experimental scenarios, control groups may be unnecessary. Paying attention to both the function and to the statistical requirements of control groups would result in reducing the total number of animals used in experiments, saving time, effort and money, and bringing research with animals within ethically acceptable bounds.
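The precision gain from pooling historical controls can be sketched directly from the standard error of the mean difference. The sketch below assumes fully exchangeable historical and current controls (the most optimistic case discussed above); all group sizes are hypothetical:

```python
import math

def se_diff(n_treat, n_ctrl, n_hist, sigma=1.0):
    """Standard error of the treatment-vs-control mean difference when
    n_hist historical controls are pooled with the current controls
    (assumes historical and current controls are fully exchangeable)."""
    return sigma * math.sqrt(1.0 / n_treat + 1.0 / (n_ctrl + n_hist))

# Baseline design: 20 treated vs 20 current controls, no historical data.
baseline = se_diff(20, 20, 0)
# Smaller current control group, augmented with 30 historical controls.
reduced = se_diff(20, 8, 30)
```

Under this exchangeability assumption, 8 current controls plus 30 historical controls give a smaller standard error than 20 current controls alone, which is the sense in which current control groups can be reduced by more than half without loss of power.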

3.
T L Cucci  M E Sieracki 《Cytometry》2001,44(3):173-178
BACKGROUND: Forward-angle light scatter, as measured by flow cytometry, can be used to estimate the size spectra of cell assemblages from natural waters. The refractive index of water samples from aquatic environments can differ because of a variety of factors such as dissolved organic content, aldehyde preservative, sample salinity, and temperature. In flow cytometric analyses, mismatch between the refractive indices of the sheath fluid and the sample causes distortion of the forward-angle light scatter signal. We measured the effect of this mismatch on cell size measurements. METHODS: We examined the error by measuring the scatter signal of a variety of particle types and sizes and changing the sheath-to-sample salinity ratio. The effects were characterized for standard microspheres, cultured phytoplankton cells of different sizes, and natural populations from an estuarine river. RESULTS: We found that the distorted scatter signals increased the apparent size of small cells (1–2 µm) by a factor of 4.5. Cells in the size range of 3–5 µm were less affected by the salinity differences, and cells larger than 5 µm were not affected. Chlorophyll and phycoerythrin fluorescences and 90° light scatter signals were not changed by sheath and sample salinity differences. CONCLUSIONS: Care must be taken to ensure that the sheath and sample refractive indices are matched when using forward light scatter to measure cell size spectra, especially in estuarine studies, where salinity can vary greatly. Of the factors considered that can change the sample refractive index, salinity gradients in an estuary cause the largest index mismatch and, consequently, the largest error in scatter.

4.
Information on statistical power is critical when planning investigations and evaluating empirical data, but actual power estimates are rarely presented in population genetic studies. We used computer simulations to assess and evaluate power when testing for genetic differentiation at multiple loci through combining test statistics or P values obtained by four different statistical approaches, viz. Pearson's chi-square, the log-likelihood ratio G-test, Fisher's exact test, and an F(ST)-based permutation test. Factors considered in the comparisons include the number of samples, their size, and the number and type of genetic marker loci. It is shown that power for detecting divergence may be substantial for frequently used sample sizes and sets of markers, even at quite low levels of differentiation. The choice of statistical method may be critical, though. For multi-allelic loci such as microsatellites, combining exact P values using Fisher's method is robust and generally provides a high resolving power. In contrast, for few-allele loci (e.g. allozymes and single nucleotide polymorphisms) and when making pairwise sample comparisons, this approach may yield a remarkably low power. In such situations chi-square typically represents a better alternative. The G-test without Williams's correction frequently tends to provide an unduly high proportion of false significances, and results from this test should be interpreted with great care. Our results are not confined to population genetic analyses but applicable to contingency testing in general.
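Combining per-locus exact P values with Fisher's method, as recommended above for multi-allelic loci, needs only the chi-square tail with 2k degrees of freedom, which has a closed form for even degrees of freedom. A minimal sketch (the per-locus P values are invented for illustration):

```python
import math

def fishers_method(p_values):
    """Combine k independent per-locus P values: X = -2 * sum(ln p)
    follows a chi-square distribution with 2k degrees of freedom under
    the global null of no differentiation at any locus. Returns the
    combined P value and the test statistic."""
    k = len(p_values)
    x = -2.0 * sum(math.log(p) for p in p_values)
    # Survival function of chi-square with even df 2k:
    # P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total, x

# Invented per-locus exact P values, none individually significant.
per_locus = [0.09, 0.20, 0.11, 0.07, 0.30]
p_combined, stat = fishers_method(per_locus)
```

None of the five illustrative loci is individually significant at 0.05, yet the combined test is; this pooling of weak evidence across loci is what gives the method its resolving power.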

5.
Sample size has long been one of the basic issues since the start of the DNA barcoding initiative and the global biodiversity investigation. As a contribution to resolving this problem, we propose a simple resampling approach to estimate several key sampling sizes for a DNA barcoding project. We illustrate our approach using both structured populations simulated under coalescent and real species of skipper butterflies. We found that sample sizes widely used in DNA barcoding are insufficient to assess the genetic diversity of a species, and that population structure affects the estimated sample sizes and hence can bias species identification.
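The resampling idea can be sketched as rarefaction over observed haplotypes: repeatedly draw subsamples of size n and find the smallest n whose subsamples recover, on average, a target fraction of the haplotype diversity. The haplotype frequencies and the 95% coverage target below are illustrative assumptions, not values from the study:

```python
import random

random.seed(42)

# Hypothetical haplotype counts for one species (skewed frequencies).
population = ["h1"] * 50 + ["h2"] * 25 + ["h3"] * 15 + ["h4"] * 7 + ["h5"] * 3

def mean_recovered(pop, n, reps=500):
    """Average fraction of distinct haplotypes captured by a sample of size n."""
    total_types = len(set(pop))
    acc = 0.0
    for _ in range(reps):
        acc += len(set(random.sample(pop, n))) / total_types
    return acc / reps

def required_sample_size(pop, coverage=0.95):
    """Smallest n whose subsamples recover >= coverage of the diversity."""
    for n in range(1, len(pop) + 1):
        if mean_recovered(pop, n) >= coverage:
            return n
    return len(pop)

n_needed = required_sample_size(population)
```

Because the rarest haplotype occurs at 3%, recovering 95% of the diversity takes on the order of 40 individuals here, far above the handful of specimens per species typical in barcoding projects.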

6.
We examined the relationship between meristem allocation and plant size for four annual plant species: Arabidopsis thaliana, Arenaria serpyllifolia, Brassica rapa, and Chaenorrhinum minus. Gradients of light and nutrient availability were used to obtain a range of plant sizes for each of these species. Relative allocations to reproductive, inactive, and growth meristems were used to measure reproductive effort, apical dominance, and branching intensity, respectively. We measured allocation to each of these three meristem fates at weekly intervals throughout development and at the final developmental stage. At all developmental stages reproductive effort and branching intensity tended to increase with increasing plant size (i.e., due to increasing resource availability) and apical dominance tended to decrease with increasing plant size. We interpret these responses as a strategy for plants to maximize fitness across a range of environments. In addition, significant differences in meristem response among species may be important in defining the range of habitats in which a species can exist and may help explain patterns of species competition and coexistence in habitats with variable resource availability.

7.
The currently used criterion for sample size calculation in a reference interval study is not well stated and leads to imprecise control of the ratio in question. We propose a generalization of the criterion used to determine sufficient sample size in reference interval studies. The generalization allows better estimation of the required sample size when the reference interval estimation will be using a power transformation or is nonparametric. Bootstrap methods are presented to estimate sample sizes required by the generalized criterion. Simulation of several distributions both symmetric and positively skewed is presented to compare the sample size estimators. The new method is illustrated on a data set of plasma glucose values from a 50‐g oral glucose tolerance test. It is seen that the sample sizes calculated from the generalized criterion lead to more reliable control of the desired ratio.
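A bootstrap version of such a sample-size criterion can be sketched by resampling a pilot sample at candidate sizes n and measuring how tightly the upper reference limit is estimated relative to the reference-interval width. The pilot distribution and the exact ratio used below are illustrative stand-ins, not the paper's generalized criterion:

```python
import random

random.seed(7)

def quantile(xs, q):
    """Linear-interpolation quantile of a list."""
    s = sorted(xs)
    idx = q * (len(s) - 1)
    lo = int(idx)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (idx - lo)

# Hypothetical pilot data standing in for plasma glucose values.
pilot = [random.gauss(100.0, 15.0) for _ in range(400)]
ri_width = quantile(pilot, 0.975) - quantile(pilot, 0.025)

def ratio_at_n(n, boots=400):
    """Bootstrap spread of the upper reference limit at sample size n,
    relative to the reference-interval width (an illustrative version
    of the ratio criterion, not the paper's exact generalization)."""
    uppers = []
    for _ in range(boots):
        resample = [random.choice(pilot) for _ in range(n)]
        uppers.append(quantile(resample, 0.975))
    return (quantile(uppers, 0.95) - quantile(uppers, 0.05)) / ri_width

small, large = ratio_at_n(60), ratio_at_n(360)
```

The ratio shrinks as n grows; a study would pick the smallest n whose ratio falls below a prespecified threshold, and the same loop works unchanged for transformed or nonparametric interval estimators.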

8.
Aim Techniques that predict species potential distributions by combining observed occurrence records with environmental variables show much potential for application across a range of biogeographical analyses. Some of the most promising applications relate to species for which occurrence records are scarce, due to cryptic habits, locally restricted distributions or low sampling effort. However, the minimum sample sizes required to yield useful predictions remain difficult to determine. Here we developed and tested a novel jackknife validation approach to assess the ability to predict species occurrence when fewer than 25 occurrence records are available. Location Madagascar. Methods Models were developed and evaluated for 13 species of secretive leaf‐tailed geckos (Uroplatus spp.) that are endemic to Madagascar, for which available sample sizes range from 4 to 23 occurrence localities (at 1 km² grid resolution). Predictions were based on 20 environmental data layers and were generated using two modelling approaches: a method based on the principle of maximum entropy (Maxent) and a genetic algorithm (GARP). Results We found high success rates and statistical significance in jackknife tests with sample sizes as low as five when the Maxent model was applied. Results for GARP at very low sample sizes (less than c. 10) were poorer. When sample sizes were experimentally reduced for those species with the most records, variability among predictions using different combinations of localities demonstrated that models were greatly influenced by exactly which observations were included. Main conclusions We emphasize that models developed using this approach with small sample sizes should be interpreted as identifying regions that have similar environmental conditions to where the species is known to occur, and not as predicting actual limits to the range of a species. 
The jackknife validation approach proposed here enables assessment of the predictive ability of models built using very small sample sizes, although use of this test with larger sample sizes may lead to overoptimistic estimates of predictive power. Our analyses demonstrate that geographical predictions developed from small numbers of occurrence records may be of great value, for example in targeting field surveys to accelerate the discovery of unknown populations and species.
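The leave-one-out jackknife validation logic can be sketched independently of the modelling algorithm. Below, a toy climatic-envelope model stands in for Maxent/GARP, and the seven occurrence records with two environmental variables are invented; the point is the validation loop, which yields a success rate analogous to the jackknife test statistic:

```python
# Toy climatic-envelope model standing in for Maxent/GARP; the seven
# (temperature, rainfall) occurrence records are invented.
records = [
    (21.0, 1400), (22.5, 1520), (20.1, 1380), (23.0, 1610),
    (21.8, 1450), (22.1, 1490), (20.7, 1420),
]

def envelope(points):
    """Min/max bounds of each environmental variable over the records."""
    lo = tuple(min(p[i] for p in points) for i in range(2))
    hi = tuple(max(p[i] for p in points) for i in range(2))
    return lo, hi

def predicts(model, point, slack=0.1):
    """Presence predicted if the point lies in the slightly padded envelope."""
    lo, hi = model
    return all(
        lo[i] - slack * (hi[i] - lo[i]) <= point[i] <= hi[i] + slack * (hi[i] - lo[i])
        for i in range(2)
    )

# Leave-one-out jackknife: train on all records but one, test on that one.
successes = 0
for i in range(len(records)):
    train = records[:i] + records[i + 1:]
    successes += predicts(envelope(train), records[i])
success_rate = successes / len(records)
```

Held-out records at the environmental extremes fall outside the trained envelope, so the success rate penalizes models that depend heavily on exactly which observations were included, echoing the variability noted above.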

9.
Effects of sample size on the performance of species distribution models  (total citations: 8; self-citations: 0; citations by others: 8)
A wide range of modelling algorithms is used by ecologists, conservation practitioners, and others to predict species ranges from point locality data. Unfortunately, the amount of data available is limited for many taxa and regions, making it essential to quantify the sensitivity of these algorithms to sample size. This is the first study to address this need by rigorously evaluating a broad suite of algorithms with independent presence–absence data from multiple species and regions. We evaluated predictions from 12 algorithms for 46 species (from six different regions of the world) at three sample sizes (100, 30, and 10 records). We used data from natural history collections to run the models, and evaluated the quality of model predictions with area under the receiver operating characteristic curve (AUC). With decreasing sample size, model accuracy decreased and variability increased across species and between models. Novel modelling methods that incorporate both interactions between predictor variables and complex response shapes (i.e. GBM, MARS-INT, BRUTO) performed better than most methods at large sample sizes but not at the smallest sample sizes. Other algorithms were much less sensitive to sample size, including an algorithm based on maximum entropy (MAXENT) that had among the best predictive power across all sample sizes. Relative to other algorithms, a distance metric algorithm (DOMAIN) and a genetic algorithm (OM-GARP) had intermediate performance at the largest sample size and among the best performance at the lowest sample size. No algorithm predicted consistently well with small sample size (n < 30) and this should encourage highly conservative use of predictions based on small sample size and restrict their use to exploratory modelling.

10.
Videotaping is currently recognized as the most reliable method of predator identification at active bird nests but it is relatively expensive and labour intensive. While the number of published studies has increased over the past 10 years, the mean sample size is not increasing. Thirty‐one case studies (n>5 events) reported 6–70 (median=22) predation events by 2–14 (median=6) species of predators. The number of predator species increased with a 0.50±0.09 (SE) power of sample size across studies (0.54±0.06 with the present study included). This relationship was consistent across single‐ and multi‐species studies and corresponded well with that found within the present 5‐year study (176 events, 20 species) where neither the annual nor the pooled‐sample species accumulation curve reached an obvious asymptote. The species accumulation curve was smooth and fell within the confidence limits of the rarefaction curve over the entire range of sample sizes, suggesting homogeneous sampling. In 31 case studies the dominant predator accounted for 21–96% (median=38%) of total predation and this proportion did not correlate with the sample size across studies. In this study the observed proportion fluctuated widely until the cumulative sample size reached about 50 records and stabilized thereafter at the final value of 37%. Because the regional pool of potential nest predators is usually high, a complete enumeration of the local predator community is difficult with an acceptable nest monitoring effort. Correct identification of the dominant predators is likely even with small samples, but quantification of their share is uncertain when based on <50 records. Researchers are encouraged to increase their sampling effort above the current level and to consider contingency of results upon sample size.
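The reported power-law relationship between predator species detected and recorded predation events can be recovered with a log-log least-squares fit. The sketch below uses invented counts constructed to follow species = events^0.5 exactly, matching the 0.50 exponent reported above:

```python
import math

# Invented (predation events, predator species) pairs constructed to
# satisfy species = events ** 0.5 exactly.
data = [(9, 3), (16, 4), (25, 5), (36, 6), (64, 8)]

# Fit S = a * n^b by ordinary least squares on log S = log a + b * log n.
xs = [math.log(n) for n, _ in data]
ys = [math.log(s) for _, s in data]
mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
a = math.exp(my - b * mx)
```

With real study data the fitted exponent b carries a standard error (0.09 in the review above), which the plain fit here omits for brevity.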

11.
In this study, we investigated the theoretical potential of size exclusion chromatography (SEC) for resolving mixtures of protein aggregates (of various sizes and shapes) produced in the generation of amyloid fibrils. We present our findings in the form of an equilibrium partition model. We first review the general characteristics of SEC and discuss the physicochemical features affecting solute transport and partition. We then develop new methods for estimating the transport and partition coefficients of protein aggregates on the basis of their molecular dimensions and the SEC column properties. We detail how these calculated properties can be used to estimate the likely resolving power of an SEC column. Model predictions were found to be in general agreement with experimental data gained from the measurement of the elution profile of sheared amyloid fibrils prepared from bovine insulin and passed through a Superose 6 precision SEC column. Our formalism should provide a basic appreciation of the competing factors at work and allow an informed choice to be made for optimal selection of SEC column medium to separate a desired size range of aggregate.

12.
Data from published sources were used to compare the numbers of different electrophoretic alleles of 29 monomeric and dimeric human enzymes to their respective subunit molecular weights. Only those human enzymes were considered for which the total sample sizes were in excess of 2000 individuals. Correlations between these two variables were determined within sample size ranges of 2000≤n≤3000 and 4000≤n≤5000 individuals, and separately by quaternary class. There was no statistically significant correlation observed for the smaller sample size range in monomers; however, the correlations for the larger sample size range in monomers and both ranges in dimers were significant. Since there is no relationship between subunit size and heterozygosity, the relationships are due primarily to the incidence of rare alleles. These findings demonstrate the effect of locus-specific mutation rates, expected as a consequence of variation of cistron sizes, and imply that other forces are responsible for the relative frequencies of common alleles at some of the loci.

13.
Jain et al. introduced the Local Pooled Error (LPE) statistical test designed for use with small sample size microarray gene-expression data. Based on an asymptotic proof, the test multiplicatively adjusts the standard error for a test of differences between two classes of observations by pi/2 due to the use of medians rather than means as measures of central tendency. The adjustment is upwardly biased at small sample sizes, however, producing fewer than expected small P-values with a consequent loss of statistical power. We present an empirical correction to the adjustment factor which removes the bias and produces theoretically expected P-values when distributional assumptions are met. Our adjusted LPE measure should prove useful to ongoing methodological studies designed to improve the LPE's performance for microarray and proteomics applications, and for future work on other high-throughput biotechnologies. AVAILABILITY: The software is implemented in the R language and can be downloaded from the Bioconductor project website (http://www.bioconductor.org).
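The pi/2 adjustment rests on the asymptotic result that, for normal data, the variance of the sample median is about pi/2 times that of the sample mean. A quick Monte Carlo sketch (sample size and replicate count are arbitrary choices) shows the ratio falls well short of pi/2 at n = 3, the small-sample regime where the unadjusted asymptotic factor overstates the standard error:

```python
import math
import random
import statistics

random.seed(0)

def var_ratio(n, reps=20_000):
    """Monte Carlo estimate of Var(median) / Var(mean) for samples of
    size n from a standard normal; asymptotically this tends to pi/2."""
    medians, means = [], []
    for _ in range(reps):
        xs = [random.gauss(0.0, 1.0) for _ in range(n)]
        medians.append(statistics.median(xs))
        means.append(statistics.fmean(xs))
    return statistics.variance(medians) / statistics.variance(means)

small_n_ratio = var_ratio(3)  # noticeably below pi/2 at n = 3
```

Because the true ratio at n = 3 is below pi/2, multiplying the standard error by the full asymptotic factor inflates it, which is the upward bias (and resulting loss of power) that the empirical correction removes.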

14.
Randomly distributed or “fluctuating” dental asymmetry has been accorded evolutionary meaning and interpreted as a result of environmental stress. However, except for congenital malformation syndromes, the determinants of human crown size asymmetry are still equivocal. Both a computer simulated sampling experiment using a combined sample size of N = 3000, and the requirements of adequate statistical power show that sample sizes of several hundred are needed to detect population differences in dental asymmetry. Using the largest available sample of children with defined prenatal stresses, we are unable to find systematic increases in crown size asymmetry. Given sampling limitations and the current inability to link increased human dental asymmetry to defined prenatal stresses, we suggest that fluctuating dental asymmetry is not yet established as a useful and reliable measure of general stress in human populations.

15.
Neonatal seizures are common in the neonatal intensive care unit. Clinicians treat these seizures with anti-epileptic drugs (AEDs). Current AEDs exhibit sub-optimal efficacy, and several randomized controlled trials (RCTs) of novel AEDs are planned. The aim of this study was to measure the influence of trial design on the required sample size of an RCT. We used seizure time courses from 41 term neonates with hypoxic ischaemic encephalopathy to build seizure treatment trial simulations. We used five outcome measures, three AED protocols, eight treatment delays from seizure onset (Td) and four levels of trial AED efficacy to simulate different RCTs. We performed power calculations for each RCT design and analysed the resultant sample size. We also assessed the rate of false positives, or placebo effect, in typical uncontrolled studies. We found that the false positive rate ranged from 5 to 85% of patients depending on RCT design. For controlled trials, the choice of outcome measure had the largest effect on sample size with median differences of 30.7 fold (IQR: 13.7–40.0) across a range of AED protocols, Td and trial AED efficacy (p<0.001). RCTs that compared the trial AED with positive controls required sample sizes with a median fold increase of 3.2 (IQR: 1.9–11.9; p<0.001). Delays in AED administration from seizure onset also increased the required sample size 2.1 fold (IQR: 1.7–2.9; p<0.001). Subgroup analysis showed that RCTs in neonates treated with hypothermia required a median fold increase in sample size of 2.6 (IQR: 2.4–3.0) compared to trials in normothermic neonates (p<0.001). These results show that RCT design has a profound influence on the required sample size. Trials that use a control group, appropriate outcome measure, and control for differences in Td between groups in analysis will be valid and minimise sample size.
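The simulation-driven power calculation used in such studies can be sketched generically: simulate many trials at a candidate sample size, estimate power as the fraction reaching significance, and increase n until a target power is met. The effect sizes, outcome model, and step size below are illustrative assumptions, not the neonatal seizure time-course model:

```python
import math
import random
import statistics

random.seed(3)

def trial_significant(n, effect, sd=1.0):
    """One simulated two-arm trial; True if the treatment arm's
    improvement reaches z > 1.96 (approximately the two-sided 5% level)."""
    ctrl = [random.gauss(0.0, sd) for _ in range(n)]
    trt = [random.gauss(-effect, sd) for _ in range(n)]
    se = math.sqrt(statistics.variance(ctrl) / n + statistics.variance(trt) / n)
    return (statistics.fmean(ctrl) - statistics.fmean(trt)) / se > 1.96

def power(n, effect, reps=1000):
    """Fraction of simulated trials that reach significance."""
    return sum(trial_significant(n, effect) for _ in range(reps)) / reps

def required_n(effect, target=0.8, step=5):
    """Smallest per-arm n (in steps of `step`) achieving target power."""
    n = step
    while power(n, effect) < target:
        n += step
    return n

n_large_effect = required_n(1.0)   # easily detected effect
n_small_effect = required_n(0.5)   # halved effect: roughly 4x the n
```

Halving the detectable effect roughly quadruples the required sample size, which is why design choices that dilute the treatment effect (weak outcome measures, delayed dosing, active comparators) inflate the n so sharply in the study above.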

16.
Matsui S  Noma H 《Biometrics》2011,67(4):1225-1235
In microarray screening for differentially expressed genes using multiple testing, assessment of power or sample size is of particular importance to ensure that few relevant genes are removed from further consideration prematurely. In this assessment, adequate estimation of the effect sizes of differentially expressed genes is crucial because of its substantial impact on power and sample‐size estimates. However, conventional methods using top genes with largest observed effect sizes would be subject to overestimation due to random variation. In this article, we propose a simple estimation method based on hierarchical mixture models with a nonparametric prior distribution to accommodate random variation and possible large diversity of effect sizes across differential genes, separated from nuisance, nondifferential genes. Based on empirical Bayes estimates of effect sizes, the power and false discovery rate (FDR) can be estimated to monitor them simultaneously in gene screening. We also propose a power index that concerns selection of top genes with largest effect sizes, called partial power. This new power index could provide a practical compromise for the difficulty in achieving high levels of usual overall power as confronted in many microarray experiments. Applications to two real datasets from cancer clinical studies are provided.

17.
18.
We summarize characteristic sequences of morphological change in the teleost visual system from larvae to large adults at the level of the retina, the optic tract and the optic tectum. These shifts include sizes and ratios of cone and rod receptor cells, sizes and types of retinal ganglion cells and optic tract fibers as well as features of the optic tectum. Teleost larvae are the smallest vertebrates known. We suggest that the utilization of color contrasts as an adaptive benefit dictates the starting point of morphological development, which is a pure cone retina in most fish larvae. The direction of morphological and functional shifts in the teleost visual system during growth is determined by continuous retinal stretch, which allows for improving visual abilities. The larval visual system probably provides just adequate photopic (cone) acuity for plankton feeding, but limited space in the retina hampers optimization of both photopic resolving power and sensitivity. Limited space also prevents the simultaneous development of the scotopic (rod) system. Over a wide range of body sizes, morphological parameters change, and photopic and scotopic resolving power, acuity and sensitivity improve. Size constraints in the teleost visual system and lifelong shifts in sensory capacities are discussed with respect to ecology and the niche concept.

19.
Although phylogenetic hypotheses can provide insights into mechanisms of evolution, their utility is limited by our inability to differentiate simultaneous speciation events (hard polytomies) from rapid cladogenesis (soft polytomies). In the present paper, we tested the potential for statistical power analysis to differentiate between hard and soft polytomies in molecular phylogenies. Classical power analysis typically is used a priori to determine the sample size required to detect a particular effect size at a particular level of significance (α) with a certain power (1 – β). A posteriori, power analysis is used to infer whether failure to reject a null hypothesis results from lack of an effect or from insufficient data (i.e., low power). We adapted this approach to molecular data to infer whether polytomies result from simultaneous branching events or from insufficient sequence information. We then used this approach to determine the amount of sequence data (sample size) required to detect a positive branch length (effect size). A worked example is provided based on the auklets (Charadriiformes: Alcidae), a group of seabirds among which relationships are represented by a polytomy, despite analyses of over 3000 bp of sequence data. We demonstrate the calculation of effect sizes and sample sizes from sequence data using a normal curve test for difference of a proportion from an expected value and a t-test for a difference of a mean from an expected value. Power analyses indicated that the data for the auklets should be sufficient to differentiate speciation events that occurred at least 100,000 yr apart (the duration of the shortest glacial and interglacial events of the Pleistocene), 2.6 million years ago.
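The normal-curve test for a proportion against an expected value translates into a standard sample-size formula for sequence length. The substitution proportions below are invented for illustration (they are not the auklet estimates):

```python
import math

def bp_needed(p_alt, p_null, z_alpha=1.645, z_beta=0.842):
    """Sequence length (bp) for a one-sided normal test that the observed
    proportion of branch-supporting substitutions p_alt exceeds the
    expectation p_null, at alpha = 0.05 with 80% power."""
    num = (z_alpha * math.sqrt(p_null * (1 - p_null))
           + z_beta * math.sqrt(p_alt * (1 - p_alt)))
    return math.ceil((num / (p_alt - p_null)) ** 2)

# Illustrative rates: 1.0% of sites support the branch vs 0.5% expected
# under a hard polytomy (hypothetical values, not the auklet data).
n_bp = bp_needed(0.01, 0.005)
```

Because the proportions involved are tiny, even a doubling of the supporting-substitution rate demands sequence lengths in the thousands of base pairs, which is why a polytomy can persist despite analyses of over 3000 bp.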

20.
Li Z  Gail MH  Pee D  Gastwirth JL 《Human heredity》2002,53(3):114-129
Risch and Teng [Genome Res 1998;8:1273-1288] and Teng and Risch [Genome Res 1999;9:234-241] proposed a class of transmission/disequilibrium test-like statistical tests based on the difference between the estimated allele frequencies in the affected and control populations. They evaluated the power of a variety of family-based and nonfamily-based designs for detecting an association between a candidate allele and disease. Because they were concerned with diseases with low penetrances, their power calculations assumed that unaffected individuals can be treated as a random sample from the population. They predicted that this assumption rendered their sample size calculations slightly conservative. We generalize their partial ascertainment conditioning by including the status of the unaffected sibs in the calculations of the distribution and power of the statistic used to compare the allele frequency in affected offspring to the estimated frequency in the parents, based on sibships with genotyped affected and unaffected sibs. Sample size formulas for our full ascertainment methods are presented. The sample sizes for our procedure are compared to those of Teng and Risch. The numerical results and simulations indicate that the simplifying assumption used in Teng and Risch can produce both conservative and anticonservative results. The magnitude of the difference between the sample sizes needed by their partial ascertainment approximation and the full ascertainment is small in the circumstances they focused on but can be appreciable in others, especially when the baseline penetrances are moderate. Two other statistics, using different estimators for the variance of the basic statistic comparing the allele frequencies in the affected and unaffected sibs, are introduced. One of them incorporates an estimate of the null variance obtained from an auxiliary sample and appears to noticeably decrease the sample sizes required to achieve a prespecified power.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号