Similar articles
 20 similar articles found (search time: 15 ms)
1.
There are more than 60 years of discussion in the statistical literature concerning the misuse and limitations of null hypothesis significance tests (NHST). Based on the prevalence of NHST in biological anthropology research, it appears that the discipline generally is unaware of these concerns. The p values used in NHST usually are interpreted incorrectly. A p value indicates the probability of the data given the null hypothesis. It should not be interpreted as the probability that the null hypothesis is true or as evidence for or against any specific alternative to the null hypothesis. P values are a function of both the sample size and the effect size, and therefore do not indicate whether the effect observed in the study is important, large, or small. P values have poor replicability in repeated experiments. The distribution of p values is continuous and varies from 0 to 1.0. The use of a cut‐off, generally p ≤ 0.05, to separate significant from nonsignificant results, is an arbitrary dichotomization of continuous variation. In 2016, the American Statistical Association issued a statement of principles regarding the misinterpretation of NHST, the first time it has done so regarding a specific statistical procedure in its 180‐year history. Effect sizes and confidence intervals, which can be calculated for any data used to calculate p values, provide more and better information about tested hypotheses than p values and NHST.
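
The dependence of p on both sample size and effect size can be illustrated with a minimal sketch. It assumes a two-sample z-test with equal group sizes and unit variance (a large-sample simplification chosen for illustration, not a method taken from the abstract); the function names `two_sided_p` and `ci95` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(d, n_per_group):
    """Two-sided p value for a standardized mean difference d.

    Assumes a two-sample z-test with equal group sizes and unit variance,
    so the standard error of the difference is sqrt(2 / n_per_group).
    """
    z = d * sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def ci95(d, n_per_group):
    """Approximate 95% confidence interval for d under the same assumptions."""
    half_width = NormalDist().inv_cdf(0.975) * sqrt(2 / n_per_group)
    return d - half_width, d + half_width


# The same effect size (d = 0.2) at two sample sizes.
for n in (20, 500):
    low, high = ci95(0.2, n)
    print(f"n={n}: p={two_sided_p(0.2, n):.4f}, 95% CI=({low:.3f}, {high:.3f})")
```

With the effect held fixed, only the sample size moves p across the 0.05 cut-off, while the interval reports the estimate and its precision directly.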

2.
Statistically nonsignificant (p > .05) results from a null hypothesis significance test (NHST) are often mistakenly interpreted as evidence that the null hypothesis is true, that is, that there is “no effect” or “no difference.” However, many of these results occur because the study had low statistical power to detect an effect. Power below 50% is common, in which case a result of no statistical significance is more likely to be incorrect than correct. The inference of “no effect” is not valid even if power is high. NHST assumes that the null hypothesis is true; p is the probability of the data under the assumption that there is no effect. A statistical test cannot confirm what it assumes. These incorrect statistical inferences could be eliminated if decisions based on p values were replaced by a biological evaluation of effect sizes and their confidence intervals. For a single study, the observed effect size is the best estimate of the population effect size, regardless of the p value. Unlike p values, confidence intervals provide information about the precision of the observed effect. In the biomedical and pharmacology literature, methods have been developed to evaluate whether effects are “equivalent,” rather than zero, as tested with NHST. These methods could be used by biological anthropologists to evaluate the presence or absence of meaningful biological effects. Most of what appears to be known about no difference or no effect between sexes, between populations, between treatments, and other circumstances in the biological anthropology literature is based on invalid statistical inference.
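
One common form of the equivalence methods mentioned above is the confidence-interval version of the two one-sided tests (TOST) procedure. The sketch below assumes a normally distributed estimate with a known standard error and a user-chosen equivalence margin; the function name `equivalent` and the example numbers are hypothetical, and the abstract does not say which equivalence method its authors have in mind.

```python
from statistics import NormalDist


def equivalent(estimate, se, margin, alpha=0.05):
    """Confidence-interval form of TOST under a normal approximation.

    Equivalence is declared only when the (1 - 2*alpha) confidence
    interval lies entirely inside (-margin, +margin).
    """
    z = NormalDist().inv_cdf(1 - alpha)
    low, high = estimate - z * se, estimate + z * se
    return (-margin < low and high < margin), (low, high)


# Two nonsignificant results (both intervals include zero), same estimate.
precise, ci_precise = equivalent(0.1, 0.5, margin=1.0)
imprecise, ci_imprecise = equivalent(0.1, 1.0, margin=1.0)
print(precise, ci_precise)
print(imprecise, ci_imprecise)
```

Both hypothetical studies would be "nonsignificant" under NHST, yet only the more precise one supports a claim that the effect is smaller than the margin.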

3.
Recent reviews of specific topics, such as the relationship between male attractiveness to females and fluctuating asymmetry or attractiveness and the expression of secondary sexual characters, suggest that publication bias might be a problem in ecology and evolution. In these cases, there is a significant negative correlation between the sample size of published studies and the magnitude or strength of the research findings (formally the ‘effect size’). If all studies that are conducted are equally likely to be published, irrespective of their findings, there should not be a directional relationship between effect size and sample size; only a decrease in the variance in effect size as sample size increases due to a reduction in sampling error. One interpretation of these reports of negative correlations is that studies with small sample sizes and weaker findings (smaller effect sizes) are less likely to be published. If the biological literature is systematically biased this could undermine the attempts of reviewers to summarise actual biological relationships by inflating estimates of average effect sizes. But how common is this problem? And does it really affect the general conclusions of literature reviews? Here, we examine data sets of effect sizes extracted from 40 peer‐reviewed, published meta‐analyses. We estimate how many studies are missing using the newly developed ‘trim and fill’ method. This method uses asymmetry in plots of effect size against sample size (‘funnel plots’) to detect ‘missing’ studies. For random‐effect models of meta‐analysis 38% (15/40) of data sets had a significant number of ‘missing’ studies. After correcting for potential publication bias, 21% (8/38) of weighted mean effects were no longer significantly greater than zero, and 15% (5/34) were no longer statistically robust when we used random‐effects models in a weighted meta‐analysis. The mean correlation between sample size and the magnitude of standardised effect size was also significantly negative (r_s = −0.20, P < 0.0001). Individual correlations were significantly negative (P < 0.10) in 35% (14/40) of cases. Publication bias may therefore affect the main conclusions of at least 15–21% of meta‐analyses. We suggest that future literature reviews assess the robustness of their main conclusions by correcting for potential publication bias using the ‘trim and fill’ method.
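
The rank correlation between sample size and effect size described above can be sketched as follows. This is a plain Spearman coefficient written out by hand; the helper names and the example data are hypothetical, and it is not the ‘trim and fill’ procedure itself.

```python
def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data set: small studies report the largest effects.
sample_sizes = [10, 20, 40, 80, 160]
effect_sizes = [0.9, 0.6, 0.5, 0.3, 0.2]
print(spearman(sample_sizes, effect_sizes))
```

A clearly negative coefficient in real data is the pattern the abstract treats as a warning sign of publication bias.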

4.
Quantifying signal repertoire size is a critical first step towards understanding the evolution of signal complexity. However, counting signal types can be so complicated and time consuming when repertoire size is large that this trait is often estimated rather than measured directly. We studied how three common methods for repertoire size quantification (i.e., simple enumeration, curve‐fitting and capture–recapture analysis) are affected by sample size and presentation style using simulated repertoires of known sizes. As expected, estimation error decreased with increasing sample size and varied among presentation styles. More surprisingly, for all but one of the presentation styles studied, curve‐fitting and capture–recapture analysis yielded errors of similar or greater magnitude than the errors researchers would make by simply assuming that the number of types in an incomplete sample is the true repertoire size. Our results also indicate that studies based on incomplete samples are likely to yield incorrect ranking of individuals and spurious correlations with other parameters regardless of the technique of choice. Finally, we argue that biological receivers face difficulties in quantifying repertoire size similar to those faced by human observers, and we explore some of the biological implications of this hypothesis.

5.
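An introduction to the cumulative effect size and total heterogeneity in meta-analysis   Cited by: 2 (self-citations: 1, citations by others: 1)
Zheng Fengying (郑凤英), Peng Shaolin (彭少麟). 《生态科学》, 2004, 23(3): 249-252
Meta-analysis is a statistical method for synthesizing the results of multiple independent experiments on the same topic; it is regarded as the best quantitative synthesis method to date, and its statistic is the effect size. The cumulative effect size and total heterogeneity are the two indices that describe, respectively, the central tendency and the variability of effect sizes, and they are the two most important parameters in meta-analysis. Several methods exist for calculating them, depending on the structure of the data. This paper introduces the methods for calculating these two indices for three data structures (unstructured, categorical, and continuous data).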
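
For the simplest case, unstructured data, the two indices can be sketched with the standard inverse-variance formulas: the cumulative effect size is the weighted mean of the study effect sizes, and the total heterogeneity is the weighted sum of squared deviations from that mean. The abstract does not give the formulas, so the choice of a fixed-effect, inverse-variance weighting and the name `combined_effect` are assumptions made for illustration.

```python
def combined_effect(effects, variances):
    """Inverse-variance weighted mean effect and total heterogeneity Q_T.

    Assumes a fixed-effect model with weights w_i = 1 / v_i:
        mean = sum(w_i * E_i) / sum(w_i)
        Q_T  = sum(w_i * (E_i - mean) ** 2)
    """
    weights = [1.0 / v for v in variances]
    mean = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q_total = sum(w * (e - mean) ** 2 for w, e in zip(weights, effects))
    return mean, q_total


# Three hypothetical studies with equal sampling variance.
mean, q_total = combined_effect([0.2, 0.4, 0.6], [0.04, 0.04, 0.04])
print(mean, q_total)
```

Under these assumptions Q_T is usually compared against a chi-square distribution with k − 1 degrees of freedom, where k is the number of studies.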

6.
7.
Despite the increased use of dry active Saccharomyces cerevisiae yeast supplementation in ruminant feeding, inconsistent results among studies hamper the prediction of its effects on animal performance. A meta-analysis has been conducted to quantify the magnitude of yeast supplementation effects on ruminal parameters, total tract nutrient digestibility, growth and feed conversion across different studies with sheep. Different methodologies and small numbers of studies necessitated the use of the classical effect size method, in which a unitless standardised effect size (Hedges's g) was used to calculate differences obtained in outcomes between supplemented and non-supplemented sheep. Summary statistics across studies were calculated with fixed and random effects models, whereas subgroup-analysis and meta-regression were applied to identify possible interfering factors that could be responsible for between-study variability. Possible publication bias was evaluated with graphical and statistical tests. Effect sizes for ruminal ammonia nitrogen (33 comparisons), pH (42 comparisons) and total volatile fatty acids (38 comparisons) showed no heterogeneity among studies (P > 0.10) and were not affected by yeast supplementation. No effects (P > 0.05) were detected on the stoichiometry of volatile fatty acids or protozoa counts. Effect sizes calculated for digestibility of dry matter, organic matter, crude protein, acid-detergent fibre and neutral-detergent fibre, which included from 17 to 28 comparisons, showed considerable (>50%) between-study variability. This variability could not effectively be explained by the categorical variables (1) mode of yeast application, (2) feed intake (ad lib versus restricted) or (3) faeces collection method, or the continuous independent variables (1) adaptation period, (2) study period, (3) dietary roughage concentration and (4) dietary crude protein concentration. According to random effects models, digestibility of dry matter, organic matter and crude protein were increased by yeast supplementation, with no effects found for digestibility of fibre components. Substantial unexplained between-study variability (50–90%) was found for growth (13 comparisons) and feed intake (9 comparisons). This meta-analysis presented evidence that addition of dry active S. cerevisiae yeast to diets did not have any effect on growth, feed conversion, ruminal parameters or fibre digestibility in sheep.
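
The standardised effect size named above, Hedges's g, can be sketched from group means, standard deviations and sample sizes. The code uses the common approximate small-sample correction J = 1 − 3 / (4(n1 + n2) − 9); the function name `hedges_g` and the example values are hypothetical and are not taken from the sheep data.

```python
from math import sqrt


def hedges_g(mean1, sd1, n1, mean2, sd2, n2):
    """Hedges's g and its approximate sampling variance.

    g is the mean difference divided by the pooled standard deviation,
    multiplied by the small-sample correction J = 1 - 3 / (4*(n1+n2) - 9).
    """
    pooled_sd = sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    g = correction * (mean1 - mean2) / pooled_sd
    variance = (n1 + n2) / (n1 * n2) + g ** 2 / (2 * (n1 + n2))
    return g, variance


# Hypothetical supplemented vs. non-supplemented groups of 10 animals each.
g, variance = hedges_g(10.0, 2.0, 10, 8.0, 2.0, 10)
print(g, variance)
```

Because g is unitless, outcomes measured on different scales can be pooled in the same fixed- or random-effects model.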

8.
Sexual isolation is a key component of reproductive isolation, involving mate choice among mature adults. While there are various statistics for estimating sexual isolation from mating frequencies, their ability to produce unbiased estimates varies considerably, depending on the particular situation. We investigated, under different biological scenarios, the estimation properties (statistical bias, efficiency, root mean square, statistical test) of 12 statistics commonly used in the literature for measuring sexual isolation. Yule's Q, V, YA (and related indices) and I_PSI are revealed to be the most efficient, with the smallest biases and root mean square deviations. Yule's Q, YA and I_PSI show better estimation properties when using infinite sample sizes, while I_PSI is preferable using smaller sample sizes. Other statistics investigated should be avoided, at least within the range of conditions considered. Regarding the parametric test of hypothesis, the best alternative is YA. We discuss the advantages and drawbacks of the various estimators, and propose I_PSI as the safest for biological sample sizes. © 2005 The Linnean Society of London, Biological Journal of the Linnean Society, 2005, 85, 307–318.

9.
The controversy over the use of null hypothesis statistical testing (NHST) has persisted for decades, yet NHST remains the most widely used statistical approach in wildlife sciences and ecology. A disconnect exists between those opposing NHST and many wildlife scientists and ecologists who conduct and publish research. This disconnect causes confusion and frustration on the part of students. We, as students, offer our perspective on how this issue may be addressed. Our objective is to encourage academic institutions and advisors of undergraduate and graduate students to introduce students to various statistical approaches so we can make well-informed decisions on the appropriate use of statistical tools in wildlife and ecological research projects. We propose an academic course that introduces students to various statistical approaches (e.g., Bayesian, frequentist, Fisherian, information theory) to build a foundation for critical thinking in applying statistics. We encourage academic advisors to become familiar with the statistical approaches available to wildlife scientists and ecologists and thus decrease bias towards one approach. Null hypothesis statistical testing is likely to persist as the most common statistical analysis tool in wildlife science until academic institutions and student advisors change their approach and emphasize a wider range of statistical methods.

10.
Although highly variable loci, such as microsatellite loci, are revolutionizing both evolutionary and conservation biology, data from these loci need to be carefully evaluated. First, because these loci often have very high within-population heterozygosity, the magnitude of differentiation measures may be quite small. For example, maximum GST values for populations with no common alleles at highly variable loci may be small and are at maximum less than the average within-population homozygosity. As a result, measures that are variation independent are recommended for highly variable loci. Second, bottlenecks or a reduction in population size can generate large genetic distances in a short time for these loci. In this case, the genetic distance may be corrected for low variation in a population and tests to detect bottlenecks are advised. Third, statistically significant differences may not reflect biologically meaningful differences, both because the patterns of adaptive loci may not be correlated with highly variable loci and because statistical power with these markers is so high. As an example of this latter effect, the statistical power to detect a one-generation bottleneck of different sizes for different numbers of highly variable loci is discussed. All of these concerns need to be incorporated in the utilization and interpretation of patterns of highly variable loci for both evolutionary and conservation biology.
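
The bound on GST stated above can be checked with a small worked example. The sketch assumes two equally sized populations that share no alleles, each carrying k equally frequent alleles, and uses GST = (H_T − H_S) / H_T; this idealised scenario and the function name `gst_no_shared_alleles` are illustrative assumptions rather than anything specified in the abstract.

```python
def gst_no_shared_alleles(k):
    """GST for two equal-sized populations with k private alleles each.

    Within each population every allele has frequency 1/k, so the
    within-population heterozygosity is H_S = 1 - 1/k. In the pooled
    sample there are 2k alleles at frequency 1/(2k), so H_T = 1 - 1/(2k).
    Returns (GST, within-population homozygosity).
    """
    h_s = 1 - 1 / k
    h_t = 1 - 1 / (2 * k)
    return (h_t - h_s) / h_t, 1 - h_s


# Completely differentiated populations, yet GST stays small when k is large.
for k in (2, 10, 25):
    gst, homozygosity = gst_no_shared_alleles(k)
    print(f"k={k}: GST={gst:.4f}, within-population homozygosity={homozygosity:.4f}")
```

In this scenario GST simplifies to 1 / (2k − 1), which is below the homozygosity 1 / k whenever k > 1, so even populations with no alleles in common yield a low GST at highly variable loci.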

11.
Meta-analysis has changed the way researchers conduct literature reviews not only in medical and social sciences but also in biological sciences. Meta-analysis in biological sciences, especially in ecology and evolution (which we refer to as ‘biological’ meta-analysis) faces somewhat different methodological problems from its counterparts in medical and social sciences, where meta-analytic techniques were originally developed. The main reason for such differences is that biological meta-analysis often integrates complex data composed of multiple strata with, for example, different measurements and a variety of species. Here, we review methodological issues and advancements in biological meta-analysis, focusing on three topics: (1) non-independence arising from multiple effect sizes obtained in single studies and from phylogenetic relatedness, (2) detecting and accounting for heterogeneity, and (3) identifying publication bias and measuring its impact. We show how the marriage between mixed-effects (hierarchical/multilevel) models and phylogenetic comparative methods has resolved most of the issues under discussion. Furthermore, we introduce the concept of across-study and within-study meta-analysis, and propose how the use of within-study meta-analysis can improve many empirical studies typical of ecology and evolution.

12.
13.
Use of nuclear magnetic resonance (NMR)-based metabonomics to search for human disease biomarkers is becoming increasingly common. For many researchers, the ultimate goal is translation from biomarker discovery to clinical application. Studies typically involve investigators from diverse educational and training backgrounds, including physicians, academic researchers, and clinical staff. In evaluating potential biomarkers, clinicians routinely use statistical significance testing language, whereas academicians typically use multivariate statistical analysis techniques that do not perform statistical significance evaluation. In this article, we outline an approach to integrate statistical significance testing with conventional principal components analysis data representation. A decision tree algorithm is introduced to select and apply appropriate statistical tests to loadings plot data, which are then heat map color-coded according to P score, enabling direct visual assessment of statistical significance. A multiple comparisons correction must be applied to determine P scores from which reliable inferences can be made. Knowledge of means and standard deviations of statistically significant buckets enabled computation of effect sizes and study sizes for a given statistical power. Methods were demonstrated using data from a previous study. Integrated metabonomics data assessment methodology should facilitate translation of NMR-based metabonomics discovery of human disease biomarkers to clinical use.

14.
Aim  To analyse quantitatively the extent to which several methodological, geographical and taxonomic variables affect the magnitude of the tendency for the latitudinal ranges of species to increase with latitude (the Rapoport effect).
Location  Global.
Methods  A meta-analysis of 49 published studies was used to evaluate the effect of several methodological and biological moderator variables on the magnitude of the pattern.
Results  The method used to depict the latitudinal variation in range sizes is a strong moderator variable that accounts for differences in the magnitude of the pattern. In contrast, the extent of the study or the use of areal or linear estimations of range sizes does not affect the magnitude of the pattern. The effect of geography is more consistent than the effect of taxonomy in accounting for differences in the magnitude of the pattern. The Rapoport effect is indeed strong in Eurasia and North America. Weaker or non-significant latitudinal trends are found at the global scale, and in Australia, South America and the New World. There are no significant differences in the magnitude of the pattern between different habitats; however, the overall pattern is weaker in oceans than in terrestrial regions of the world.
Main conclusions  The Rapoport effect is indeed strong in continental landmasses of the Northern Hemisphere. The magnitude of the effect is primarily affected by methodological and biogeographical factors. Ecological and spatial scale effects seem to be less important. We suggest that not all methodological approaches may be equally useful for analysing the pattern.

15.
Network meta-analysis synthesizes direct and indirect evidence in a network of trials that compare multiple interventions and has the potential to rank the competing treatments according to the studied outcome. Despite its usefulness, network meta-analysis is often criticized for its complexity and for being accessible only to researchers with strong statistical and computational skills. The evaluation of the underlying model assumptions, the statistical technicalities and presentation of the results in a concise and understandable way are all challenging aspects in the network meta-analysis methodology. In this paper we aim to make the methodology accessible to non-statisticians by presenting and explaining a series of graphical tools via worked examples. To this end, we provide a set of STATA routines that can be easily employed to present the evidence base, evaluate the assumptions, fit the network meta-analysis model and interpret its results.

16.
Meta-analysis is an increasingly popular tool for combining multiple different genome-wide association studies (GWASs) in a single aggregate analysis in order to identify associations with very small effect sizes. Because the data of a meta-analysis can be heterogeneous, referring to the differences in effect sizes between the collected studies, what is often done in the literature is to apply both the fixed-effects model (FE) under an assumption of the same effect size between studies and the random-effects model (RE) under an assumption of varying effect size between studies. However, surprisingly, RE gives less significant p values than FE at variants that actually show varying effect sizes between studies. This is ironic because RE is designed specifically for the case in which there is heterogeneity. As a result, usually, RE does not discover any associations that FE did not discover. In this paper, we show that the underlying reason for this phenomenon is that RE implicitly assumes a markedly conservative null-hypothesis model, and we present a new random-effects model that relaxes the conservative assumption. Unlike the traditional RE, the new method is shown to achieve higher statistical power than FE when there is heterogeneity, indicating that the new method has practical utility for discovering associations in the meta-analysis of GWASs.
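
The behaviour of the traditional models described above can be reproduced with a short sketch that compares a fixed-effects test with a DerSimonian-Laird random-effects test on the same data. The DerSimonian-Laird estimator is assumed here as the "traditional RE"; the new random-effects model proposed in the paper is not implemented, and the function name and example values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def fe_re_pvalues(effects, variances):
    """Two-sided p values under fixed effects (FE) and traditional
    random effects (RE, DerSimonian-Laird), plus the estimated tau^2."""
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    fe_mean = sum(w * e for w, e in zip(weights, effects)) / total_weight

    # Cochran's Q and the DerSimonian-Laird between-study variance.
    q = sum(w * (e - fe_mean) ** 2 for w, e in zip(weights, effects))
    c = total_weight - sum(w ** 2 for w in weights) / total_weight
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)

    re_weights = [1.0 / (v + tau2) for v in variances]
    re_total = sum(re_weights)
    re_mean = sum(w * e for w, e in zip(re_weights, effects)) / re_total

    def p_value(z):
        return 2 * (1 - NormalDist().cdf(abs(z)))

    return p_value(fe_mean * sqrt(total_weight)), p_value(re_mean * sqrt(re_total)), tau2


# Three hypothetical studies with clearly different effect sizes.
p_fe, p_re, tau2 = fe_re_pvalues([0.1, 0.5, 0.9], [0.01, 0.01, 0.01])
print(p_fe, p_re, tau2)
```

In this example the heterogeneity inflates the RE standard error, so RE reports a much larger p value than FE for the same pooled estimate, which is the pattern the abstract calls ironic.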

17.
Condition indices for conservation: new uses for evolving tools   Cited by: 3 (self-citations: 0, citations by others: 3)
Biologists have developed a wide range of morphological, biochemical and physiological metrics to assess the health and, in particular, the energetic status of individual animals. These metrics originated to quantify aspects of human health, but have also proven useful to address questions in life history, ecology and resource management of game and commercial animals. We review the application of condition indices (CI) for conservation studies and focus on measures that quantify fat reserves, known to be critical for energetically challenging activities such as migration, reproduction and survival during periods of scarcity. Standard methods score fat content, or rely on a ratio of body mass rationalized by some measure of size, usually a linear dimension such as wing length or total body length. Higher numerical values of these indices are interpreted to mean an animal has greater energy reserves. Such CIs can provide predictive information about habitat quality and reproductive output, which in turn can help managers with conservation assessments and policies. We review the issues about the methods and metrics of measurement and describe the linkage of CIs to measures of body shape. Debates in the literature about the best statistical methods to use in computing and comparing CIs remain unresolved. Next, we comment on the diversity of methods used to measure body composition and the diversity of physiological models that compute body composition and CIs. The underlying physiological regulatory systems that govern the allocation of energy and nutrients among compartments and processes within the body are poorly understood, especially for field situations, and await basic data from additional laboratory studies and advanced measurement systems including telemetry. For now, standard physiological CIs can provide supporting evidence and mechanistic linkages for population studies that have traditionally been the focus of conservation biology. Physiologists can provide guidance for the field application of condition indices with validation studies and development of new instruments.

18.
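The null hypothesis dilemma in the experimental testing of ecological hypotheses   Cited by: 1 (self-citations: 1, citations by others: 0)
Li Ji (李际). 《生态学杂志》, 2016, 27(6): 2031-2038
Experimentation is one of the main methods for testing ecological hypotheses, but it has been questioned on grounds arising from the null hypothesis. Quinn and Dunham (1983), analyzing the hypothetico-deductive model of Platt (1964), argued that ecology cannot have null hypotheses that can be strictly tested by experiment. Fisher's falsificationism and the non-decisiveness of the Neyman-Pearson (N-P) approach mean that a statistical null hypothesis cannot be strictly verified; and because ecological processes, unlike classical physics, admit a null hypothesis H0 (α=1, β=0) alongside different alternative hypotheses H1′ (α′=1, β′=0), an ecological null hypothesis is likewise difficult to test strictly by experiment. Lowering the P value, choosing the null hypothesis carefully, and applying non-centralization and two-sided testing to the non-null hypothesis can each alleviate these null hypothesis dilemmas. However, statistical null hypothesis significance testing (NHST) should not be equated with the logical method of proving causal relationships in ecological hypotheses. Consequently, the results and conclusions of the many existing methodological studies and experimental tests of ecological hypotheses based on NHST are not absolutely logically reliable.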

19.
Fear of falling and other fall-related psychological concerns (FRPCs), such as falls-efficacy and balance confidence, are highly prevalent among community-dwelling older adults. Anxiety and FRPCs have frequently, but inconsistently, been found to be associated in the literature. The purpose of this study is to clarify those inconsistencies with a systematic review and meta-analysis and to evaluate if the strength of this relationship varies based on the different FRPC constructs used (e.g., fear of falling, falls-efficacy or balance confidence). A systematic review was conducted through multiple databases (e.g., MEDLINE, PsycINFO) to include all articles published before June 10, 2015 that measured anxiety and FRPCs in community-dwelling older adults. Active researchers in the field were also contacted in an effort to include unpublished studies. The systematic review led to the inclusion of twenty relevant articles (n = 4738). A random-effects meta-analysis revealed that the mean effect size for fear of falling and anxiety is r = 0.32 (95% CI: 0.22–0.40), Z = 6.49, p < 0.001 and the mean effect size for falls-efficacy or balance confidence and anxiety is r = 0.31 (95% CI: 0.23–0.40), Z = 6.72, p < 0.001. A Q-test for heterogeneity revealed that the two effect sizes are not significantly different (Q(19) = 0.13, p = n.s.). This study is the first meta-analysis on the relationship between anxiety and FRPCs among community-dwelling older adults. It demonstrates the importance of considering anxiety when treating older adults with FRPCs.

20.
A focus on novel, confirmatory, and statistically significant results leads to substantial bias in the scientific literature. One type of bias, known as “p-hacking,” occurs when researchers collect or select data or statistical analyses until nonsignificant results become significant. Here, we use text-mining to demonstrate that p-hacking is widespread throughout science. We then illustrate how one can test for p-hacking when performing a meta-analysis and show that, while p-hacking is probably common, its effect seems to be weak relative to the real effect sizes being measured. This result suggests that p-hacking probably does not drastically alter scientific consensuses drawn from meta-analyses.
