Similar articles
 Found 20 similar articles (search time: 187 ms)
1.

Background

The removal of outliers to acquire a significant result is a questionable research practice that appears to be commonly used in psychology. In this study, we investigated whether the removal of outliers in psychology papers is related to weaker evidence (against the null hypothesis of no effect), a higher prevalence of reporting errors, and smaller sample sizes in these papers compared to papers in the same journals that did not report the exclusion of outliers from the analyses.

Methods and Findings

We retrieved a total of 2667 statistical results of null hypothesis significance tests from 153 articles in main psychology journals, and compared results from articles in which outliers were removed (N = 92) with results from articles that reported no exclusion of outliers (N = 61). We preregistered our hypotheses and methods and analyzed the data at the level of articles. Results show no significant difference between the two types of articles in median p value, sample sizes, or prevalence of all reporting errors, large reporting errors, and reporting errors that concerned the statistical significance. However, we did find a discrepancy between the reported degrees of freedom of t tests and the reported sample size in 41% of articles that did not report removal of any data values. This suggests common failure to report data exclusions (or missingness) in psychological articles.

Conclusions

We failed to find that the removal of outliers from the analysis in psychological articles was related to weaker evidence (against the null hypothesis of no effect), sample size, or the prevalence of errors. However, our control sample might be contaminated due to nondisclosure of excluded values in articles that did not report exclusion of outliers. Results therefore highlight the importance of more transparent reporting of statistical analyses.
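
The degrees-of-freedom check described in the abstract above can be sketched roughly as follows. This is only an illustration and not the authors' procedure: it assumes a classical Student t test, where df = N - 1 for one-sample or paired designs and df = N - 2 for two independent groups with pooled variance. The function names are hypothetical, and Welch-type tests with fractional df or more complex designs would need separate handling.

```python
def expected_t_dfs(sample_size):
    """Degrees of freedom a classical t test would have for a given total N."""
    return {
        "one-sample or paired": sample_size - 1,
        "independent two-sample (pooled variance)": sample_size - 2,
    }


def df_matches_sample_size(reported_df, sample_size):
    """Return True if the reported df is compatible with the reported N."""
    return reported_df in expected_t_dfs(sample_size).values()


if __name__ == "__main__":
    # Hypothetical article reporting N = 40.
    print(df_matches_sample_size(38, 40))  # compatible with a two-group design
    print(df_matches_sample_size(35, 40))  # mismatch: possible undisclosed exclusions or missing data
```

A mismatch flagged this way does not prove that data were excluded; it only marks a result whose df and N deserve a closer look.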

2.
Good quality medical research generally requires not only expertise in the chosen medical field of interest but also a sound knowledge of statistical methodology. The number of medical research articles published in Indian medical journals has increased substantially in the past decade. The aim of this study was to collate all evidence on study design quality and statistical analyses used in selected leading Indian medical journals. Ten leading Indian medical journals were selected based on impact factors, and all original research articles published in 2003 (N = 588) and 2013 (N = 774) were categorized and reviewed. A validated checklist on study design, statistical analyses, results presentation, and interpretation was used for review and evaluation of the articles. The main outcomes considered in the present study were study design types and their frequencies, the proportion of errors/defects in study design and statistical analyses, and the implementation of the CONSORT checklist in randomized clinical trials (RCTs). From 2003 to 2013, the proportion of erroneous statistical analyses did not decrease (χ² = 0.592, Φ = 0.027, p = 0.4418): 25% (80/320) in 2003 compared to 22.6% (111/490) in 2013. Compared with 2003, significant improvement was seen in 2013: the proportion of papers using statistical tests increased significantly (χ² = 26.96, Φ = 0.16, p < 0.0001) from 42.5% (250/588) to 56.7% (439/774). The overall proportion of errors in study design decreased significantly (χ² = 16.783, Φ = 0.12, p < 0.0001), 41.3% (243/588) compared to 30.6% (237/774). In 2013, the proportion of randomized clinical trial designs remained very low (7.3%, 43/588), with the majority showing some errors (41 papers, 95.3%). The majority of the published studies were retrospective in nature both in 2003 [79.1% (465/588)] and in 2013 [78.2% (605/774)]. Major decreases in error proportions were observed in both results presentation (χ² = 24.477, Φ = 0.17, p < 0.0001), 82.2% (263/320) compared to 66.3% (325/490), and interpretation (χ² = 25.616, Φ = 0.173, p < 0.0001), 32.5% (104/320) compared to 17.1% (84/490), though some serious errors were still present. Indian medical research seems to have made no major progress in using correct statistical analyses, but errors/defects in study designs have decreased significantly. Randomized clinical trials are quite rarely published and have a high proportion of methodological problems.
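
As a rough check of how the χ² and Φ values in the abstract above relate to the reported counts, the following sketch computes a Pearson chi-square for a 2×2 table. It assumes no continuity correction and one degree of freedom, which appears to match the reported figures but is not stated in the abstract; the function name is hypothetical.

```python
import math


def chi_square_2x2(events_a, total_a, events_b, total_b):
    """Pearson chi-square (no continuity correction), phi, and p for two proportions."""
    a, b = events_a, total_a - events_a
    c, d = events_b, total_b - events_b
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    phi = math.sqrt(chi2 / n)
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, phi, p_value


if __name__ == "__main__":
    # Erroneous statistical analyses: 80/320 in 2003 versus 111/490 in 2013.
    chi2, phi, p_value = chi_square_2x2(80, 320, 111, 490)
    print(f"chi2 = {chi2:.3f}, phi = {phi:.3f}, p = {p_value:.4f}")
```

For the 80/320 versus 111/490 comparison, this calculation agrees with the reported χ² = 0.592, Φ = 0.027, and p = 0.4418 to the stated precision.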

3.
In a cross-sectional survey conducted in February-May 2019, the prevalence of Pediculosis capitis, together with demographic data and behavioral practices, was investigated among 750 participants in the Eastern region of Saudi Arabia. Female participation was markedly higher (94.08%) than male participation (5.91%). A deficiency of knowledge about lice infestation was noted, especially among illiterate participants, arising from their socio-economic levels (p-value = 0.001). Lice infestation reached higher rates in children aged less than 20 years with itching of the scalp. The results revealed that 59.33% of the respondents believed that frequent personal hygiene and hair washing were the best methods for preventing lice infestation. However, treatment of lice infestation using anti-lice agents (p-value = 0.020) was preferred by 14.26% of participants. Knowledge about the preventive tools for lice infestation was not significantly associated with the experience of infestation (p-value 0.089), whereas knowledge about the appropriate treatment to kill lice (p-value 0.020) and wrong practices in the treatment of a head lice infestation (p-value 0.005) were significantly associated with the experience of infestation. Health programs and prevention campaigns are highly advised to increase awareness of Pediculosis capitis, with an effective strategic plan to control, manage, and prevent this disease.

4.
In this project I investigate the use and possible misuse of p values in papers published in five (high-ranked) journals in experimental psychology. I use a data set of over 135,000 p values from more than five thousand papers. I inspect (1) the way in which the p values are reported and (2) their distribution. The main findings are as follows. First, it appears that some authors choose the mode of reporting their results in an arbitrary way. Moreover, they often end up doing it in a way that makes their findings seem more statistically significant than they really are (which is well known to improve the chances of publication). Specifically, they frequently report p values “just above” significance thresholds directly, whereas other values are reported by means of inequalities (e.g. “p<.1”); they round the p values down more eagerly than up; and they appear to choose between the significance thresholds and between one- and two-sided tests only after seeing the data. Further, about 9.2% of reported p values are inconsistent with their underlying statistics (e.g. F or t), and it appears that there are “too many” “just significant” values. One interpretation of this is that researchers tend to choose the model or include/discard observations to bring the p value to the right side of the threshold.
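
The idea of checking a reported p value against its underlying test statistic can be illustrated with a minimal sketch. The abstract refers to F and t statistics; because the Python standard library has no F or t distribution, this sketch swaps in a two-sided z test instead, so it is an analogy rather than the author's procedure. The function names and the rounding rule are assumptions.

```python
import math


def two_sided_p_from_z(z):
    """Two-sided p value of a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def p_is_consistent(z_reported, p_reported, z_decimals=2, p_decimals=2):
    """Check whether a reported p could arise from a z rounded to z_decimals."""
    half_step = 0.5 * 10 ** -z_decimals
    z_low = max(abs(z_reported) - half_step, 0.0)
    z_high = abs(z_reported) + half_step
    # A larger |z| gives a smaller p, so the bounds swap.
    p_max = two_sided_p_from_z(z_low)
    p_min = two_sided_p_from_z(z_high)
    tolerance = 0.5 * 10 ** -p_decimals  # allow for rounding of the reported p
    return p_min - tolerance <= p_reported <= p_max + tolerance


if __name__ == "__main__":
    print(p_is_consistent(1.96, 0.05))  # plausible pair
    print(p_is_consistent(1.50, 0.05))  # reported p is too small for this statistic
```

The rounding tolerance is deliberately generous, so that a value is flagged only when no rounding of the statistic or the p value could reconcile the two.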

5.

Background

Increased risk of miscarriage has been reported for women with specific chronic health conditions. A broader investigation of chronic diseases and miscarriage risk may uncover patterns across categories of illness. The objective of this study was to study the risk of miscarriage according to various preexisting chronic diseases.

Methods and findings

We conducted a registry-based study. Registered pregnancies (n = 593,009) in Norway between 2010 and 2016 were identified through 3 national health registries (birth register, general practitioner data, and patient registries). Six broad categories of illness were identified, comprising 25 chronic diseases defined by diagnostic codes used in general practitioner and patient registries. We required that the diseases were diagnosed before the pregnancy of interest. Miscarriage risk according to underlying chronic diseases was estimated as odds ratios (ORs) using generalized estimating equations adjusting for woman’s age. The mean age of women at the start of pregnancy was 29.7 years (SD 5.6 years). We observed an increased risk of miscarriage among women with cardiometabolic diseases (OR 1.25, 95% CI 1.20 to 1.31; p-value <0.001). Within this category, risks were elevated for all conditions: atherosclerosis (2.22; 1.42 to 3.49; p-value <0.001), hypertensive disorders (1.19; 1.13 to 1.26; p-value <0.001), and type 2 diabetes (1.38; 1.26 to 1.51; p-value <0.001). Among other categories of disease, risks were elevated for hypoparathyroidism (2.58; 1.35 to 4.92; p-value 0.004), Cushing syndrome (1.97; 1.06 to 3.65; p-value 0.03), Crohn’s disease (OR 1.31; 95% CI: 1.18 to 1.45; p-value 0.001), and endometriosis (1.22; 1.15 to 1.29; p-value <0.001). Findings were largely unchanged after mutual adjustment. Limitations of this study include our inability to adjust for measures of socioeconomic position or lifestyle characteristics, in addition to the rareness of some of the conditions providing limited power.

Conclusions

In this registry study, we found that, although risk of miscarriage was largely unaffected by maternal chronic diseases, risk of miscarriage was associated with conditions related to cardiometabolic health. This finding is consistent with emerging evidence linking cardiovascular risk factors to pregnancy complications.

In this registry data study, Maria Magnus and colleagues study associations between miscarriage risk and chronic conditions.

6.
The authors evaluate the quality of research reported in major journals in social-personality psychology by ranking those journals with respect to their N-pact Factors (NF), that is, the statistical power of the empirical studies they publish to detect typical effect sizes. Power is a particularly important attribute for evaluating research quality because, relative to studies that have low power, studies that have high power are more likely (a) to provide accurate estimates of effects, (b) to produce literatures with low false positive rates, and (c) to lead to replicable findings. The authors show that the average sample size in social-personality research is 104 and that the power to detect the typical effect size in the field is approximately 50%. Moreover, they show that there is considerable variation among journals in sample sizes and power of the studies they publish, with some journals consistently publishing higher power studies than others. The authors hope that these rankings will be of use to authors who are choosing where to submit their best work, provide hiring and promotion committees with a superior way of quantifying journal quality, and encourage competition among journals to improve their NF rankings.

7.
Appropriate study design and proper statistical analysis are necessary ingredients for improving the quality and reliability of the information in journal articles. General surgery and plastic surgery articles were compared for principal author's academic degree, a Ph.D.'s presence as a coauthor, the study type, the presence of statistical analysis, the analysis' appropriateness, and the types of errors in study design or statistical analysis. Ph.D. authorship was associated with an increased percentage of articles using statistical analysis. When compared with general surgery articles, plastic surgery articles performed four times fewer statistical analyses. However, when statistical analyses were performed, there were few differences between these two specialties. Although there were no differences in the types of statistical analysis errors, there were differences in the types of study design errors. The causes of these discrepancies may lie in the nature of plastic surgery; they may be reduced by adherence to Feinstein's principles of study design and result interpretation.

8.

Background

The widespread reluctance to share published research data is often hypothesized to be due to the authors' fear that reanalysis may expose errors in their work or may produce conclusions that contradict their own. However, these hypotheses have not previously been studied systematically.

Methods and Findings

We related the reluctance to share research data for reanalysis to 1148 statistically significant results reported in 49 papers published in two major psychology journals. We found the reluctance to share data to be associated with weaker evidence (against the null hypothesis of no effect) and a higher prevalence of apparent errors in the reporting of statistical results. The unwillingness to share data was particularly clear when reporting errors had a bearing on statistical significance.

Conclusions

Our findings on the basis of psychological papers suggest that statistical results are particularly hard to verify when reanalysis is more likely to lead to contrasting conclusions. This highlights the importance of establishing mandatory data archiving policies.

9.

Aim

Targeted temperature management (TTM) for in-hospital cardiac arrest (IHCA) is given different recommendation levels within international resuscitation guidelines. We aimed to identify whether TTM would be associated with favourable outcomes following IHCA and to determine which factors would influence the decision to implement TTM.

Methods

We conducted a retrospective observational study in a single medical centre. We included adult patients suffering IHCA between 2006 and 2014. We used multivariable logistic regression analysis to evaluate associations between independent variables and outcomes.

Results

We included a total of 678 patients in our analysis; only 22 (3.2%) patients received TTM. Most (81.1%) patients met at least one exclusion criterion for TTM. In all, 144 (21.2%) patients survived to hospital discharge; among them, 60 (8.8%) patients displayed favourable neurological status at discharge. TTM use was significantly associated with favourable neurological outcome (OR: 3.74, 95% confidence interval [CI]: 1.19–11.00; p-value = 0.02), but it was not associated with survival (OR: 1.41, 95% CI: 0.54–3.66; p-value = 0.48). Arrest in the emergency department was positively associated with TTM use (OR: 22.48, 95% CI: 8.40–67.64; p-value < 0.001) and having vasopressors in place at the time of arrest was inversely associated with TTM use (OR: 0.08, 95% CI: 0.004–0.42; p-value = 0.02).

Conclusion

TTM might be associated with favourable neurological outcome of IHCA patients, irrespective of arrest rhythms. The prevalence of proposed exclusion criteria for TTM was high among IHCA patients, but these factors did not influence the use of TTM in clinical practice or neurological outcomes after IHCA.
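
To make the link between an odds ratio, its 95% confidence interval, and the p-value concrete, the sketch below back-calculates an approximate two-sided p-value from the reported interval. It assumes the interval is a symmetric Wald interval on the log scale, which the abstract does not state, so the result is only an approximation; the function names are hypothetical.

```python
import math


def p_value_from_ratio_ci(estimate, ci_lower, ci_upper, z_crit=1.96):
    """Approximate two-sided p-value from a ratio estimate and its 95% CI."""
    # On the log scale the CI half-width equals z_crit standard errors.
    standard_error = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z_crit)
    z = math.log(estimate) / standard_error
    return math.erfc(abs(z) / math.sqrt(2))


def ci_excludes_null(ci_lower, ci_upper, null_value=1.0):
    """A ratio is significant at the 5% level when its 95% CI excludes 1."""
    return not (ci_lower <= null_value <= ci_upper)


if __name__ == "__main__":
    # Values taken from the abstract above: neurological outcome, then survival.
    for odds_ratio, low, high in [(3.74, 1.19, 11.00), (1.41, 0.54, 3.66)]:
        p = p_value_from_ratio_ci(odds_ratio, low, high)
        print(odds_ratio, ci_excludes_null(low, high), round(p, 2))
```

For the neurological-outcome and survival estimates, the approximation agrees with the reported p-values of 0.02 and 0.48. It can diverge when the interval is strongly asymmetric on the log scale or was computed by an exact method.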

10.

Background

The q-value is a widely used statistical method for estimating the false discovery rate (FDR), which is a conventional significance measure in the analysis of genome-wide expression data. The q-value is a random variable and may underestimate the FDR in practice. An underestimated FDR can lead to unexpected false discoveries in follow-up validation experiments. This issue has not been well addressed in the literature, especially when a permutation procedure is necessary for p-value calculation.

Results

We proposed a statistical method for the conservative adjustment of q-value. In practice, it is usually necessary to calculate p-value by a permutation procedure. This was also considered in our adjustment method. We used simulation data as well as experimental microarray or sequencing data to illustrate the usefulness of our method.

Conclusions

The conservativeness of our approach has been mathematically confirmed in this study. We have demonstrated the importance of conservative adjustment of q-value, particularly in the situation that the proportion of differentially expressed genes is small or the overall differential expression signal is weak.

11.

Background

We examined whether key sociodemographic and clinical risk factors for Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) infection and mortality changed over time in a population-based cohort study.

Methods and findings

In a cohort of 9,127,673 persons enrolled in the United States Veterans Affairs (VA) healthcare system, we evaluated the independent associations of sociodemographic and clinical characteristics with SARS-CoV-2 infection (n = 216,046), SARS-CoV-2–related mortality (n = 10,230), and case fatality at monthly intervals between February 1, 2020 and March 31, 2021. VA enrollees had a mean age of 61 years (SD 17.7) and were predominantly male (90.9%) and White (64.5%), with 14.6% of Black race and 6.3% of Hispanic ethnicity. Black (versus White) race was strongly associated with SARS-CoV-2 infection (adjusted odds ratio [AOR] 5.10, [95% CI 4.65 to 5.59], p-value <0.001), mortality (AOR 3.85 [95% CI 3.30 to 4.50], p-value < 0.001), and case fatality (AOR 2.56, 95% CI 2.23 to 2.93, p-value < 0.001) in February to March 2020, but these associations were attenuated and not statistically significant by November 2020 for infection (AOR 1.03 [95% CI 1.00 to 1.07] p-value = 0.05) and mortality (AOR 1.08 [95% CI 0.96 to 1.20], p-value = 0.21) and were reversed for case fatality (AOR 0.86, 95% CI 0.78 to 0.95, p-value = 0.005). American Indian/Alaska Native (AI/AN versus White) race was associated with higher risk of SARS-CoV-2 infection in April and May 2020; this association declined over time and reversed by March 2021 (AOR 0.66 [95% CI 0.51 to 0.85] p-value = 0.004). Hispanic (versus non-Hispanic) ethnicity was associated with higher risk of SARS-CoV-2 infection and mortality during almost every time period, with no evidence of attenuation over time. Urban (versus rural) residence was associated with higher risk of infection (AOR 2.02, [95% CI 1.83 to 2.22], p-value < 0.001), mortality (AOR 2.48 [95% CI 2.08 to 2.96], p-value < 0.001), and case fatality (AOR 2.24, 95% CI 1.93 to 2.60, p-value < 0.001) in February to April 2020, but these associations attenuated over time and reversed by September 2020 (AOR 0.85, 95% CI 0.81 to 0.89, p-value < 0.001 for infection, AOR 0.72, 95% CI 0.62 to 0.83, p-value < 0.001 for mortality and AOR 0.81, 95% CI 0.71 to 0.93, p-value = 0.006 for case fatality). Throughout the observation period, high comorbidity burden, younger age, and obesity were consistently associated with infection, while high comorbidity burden, older age, and male sex were consistently associated with mortality. Limitations of the study include that changes over time in the associations of some risk factors may be affected by changes in the likelihood of testing for SARS-CoV-2 according to those risk factors; also, study results apply directly to VA enrollees who are predominantly male and have comprehensive healthcare and need to be confirmed in other populations.

Conclusions

In this study, we found that strongly positive associations of Black and AI/AN (versus White) race and urban (versus rural) residence with SARS-CoV-2 infection, mortality, and case fatality observed early in the pandemic were ameliorated or reversed by March 2021.

George Ioannou and co-workers study the distribution of SARS-CoV-2 infections and outcomes among the United States population.

12.
Reconstruction of gene regulatory networks based on experimental data usually relies on statistical evidence, necessitating the choice of a statistical threshold which defines a significant biological effect. Approaches to this problem found in the literature range from rigorous multiple testing procedures to ad hoc P-value cut-off points. However, when the data implies graphical structure, it should be possible to exploit this feature in the threshold selection process. In this article we propose a procedure based on this principle. Using coding theory we devise a measure of graphical structure, for example, highly connected nodes or chain structure. The measure for a particular graph can be compared to that of a random graph and structure inferred on that basis. By varying the statistical threshold the maximum deviation from random structure can be estimated, and the threshold is then chosen on that basis. A global test for graph structure follows naturally.

13.
Estimates of past forest composition obtained from Late Quaternary pollen spectra via a calibration of modern pollen spectra in terms of species abundances are subject to various sources of error, whose combined effect requires statistical analysis. Two statistical procedures, the maximum likelihood method and an approach using series expansions, are used to estimate standard deviations associated with forest composition estimates obtained via the R-value method of calibration. The two approaches yield similar values. The series expansion method also allows one to allocate pollen counting effort between fossil and modern samples in such a way as to maximize the precision of the final estimates. M.B. Davis's original controversial estimates of early Holocene forest composition in Vermont, U.S.A., are shown to have been vitiated by statistical errors. The optimum allocation procedure here suggests increasing the relative effort put into the modern count. This change would have improved but not rescued the estimates; omission of Larix, however, led to a substantial reduction in the errors. Exceptionally poor pollen producers such as Larix should generally be excluded from quantitative calibration; the remaining taxa should be calibrated on the basis of large samples of pollen, the modern pollen being collected preferably from a network of surface sampling sites.

14.

Background

Litigation documents reveal that pharmaceutical companies have paid physicians to promote off-label uses of their products through a number of different avenues. It is unknown whether physicians and scientists who have such conflicts of interest adequately disclose such relationships in the scientific publications they author.

Methods and Findings

We collected whistleblower complaints alleging illegal off-label marketing from the US Department of Justice and other publicly available sources (date range: 1996–2010). We identified physicians and scientists described in the complaints as having financial relationships with defendant manufacturers, then searched Medline for articles they authored in the subsequent three years. We assessed disclosures made in articles related to the off-label use in question, determined the frequency of adequate disclosure statements, and analyzed characteristics of the authors (specialty, author position) and articles (type, connection to off-label use, journal impact factor, citation count/year). We identified 39 conflicted individuals in whistleblower complaints. They published 404 articles related to the drugs at issue in the whistleblower complaints, only 62 (15%) of which contained an adequate disclosure statement. Most articles had no disclosure (43%) or did not mention the pharmaceutical company (40%). Adequate disclosure rates varied significantly by article type, with commentaries less likely to have adequate disclosure compared to articles reporting original studies or trials (adjusted odds ratio [OR] = 0.10, 95%CI = 0.02–0.67, p = 0.02). Over half of the authors (22/39, 56%) made no adequate disclosures in their articles. However, four of six authors with ≥25 articles disclosed in about one-third of articles (range: 10/36–8/25 [28%–32%]).

Conclusions

One in seven authors identified in whistleblower complaints as involved in off-label marketing activities adequately disclosed their conflict of interest in subsequent journal publications. This is a much lower rate of adequate disclosure than has been identified in previous studies. The non-disclosure patterns suggest shortcomings with authors and the rigor of journal practices. Please see later in the article for the Editors' Summary.

15.
Over recent years many statisticians and researchers have highlighted that statistical inference would benefit from a better use and understanding of hypothesis testing, p-values, and statistical significance. We highlight three recommendations in the context of biochemical sciences. First recommendation: to improve the biological interpretation of biochemical data, do not use p-values (or similar test statistics) as thresholded values to select biomolecules. Second recommendation: to improve comparison among studies and to achieve robust knowledge, perform complete reporting of data. Third recommendation: statistical analyses should be reported completely with exact numbers (not as asterisks or inequalities). Owing to the high number of variables, a better use of statistics is of special importance in omic studies.

16.
This article describes a systematic analysis of the relationship between empirical data and theoretical conclusions for a set of experimental psychology articles published in the journal Science between 2005–2012. When the success rate of a set of empirical studies is much higher than would be expected relative to the experiments' reported effects and sample sizes, it suggests that null findings have been suppressed, that the experiments or analyses were inappropriate, or that the theory does not properly follow from the data. The analyses herein indicate such excess success for 83% (15 out of 18) of the articles in Science that report four or more studies and contain sufficient information for the analysis. This result suggests a systematic pattern of excess success among psychology articles in the journal Science.

17.

Background

Consumption of sugar-sweetened beverages (SSBs) has been consistently associated with a higher risk of obesity, type 2 diabetes, cardiovascular disease, and premature mortality, whereas evidence for artificially sweetened beverages (ASBs) and fruit juices on health is less solid. The aim of this study was to evaluate the consumption of SSBs, ASBs, and fruit juices in association with frailty risk among older women.

Methods and findings

We analyzed data from 71,935 women aged ≥60 (average baseline age was 63) participating in the Nurses’ Health Study (NHS), an ongoing cohort study initiated in 1976 among female registered nurses in the United States. Consumption of beverages was derived from 6 repeated food frequency questionnaires (FFQs) administered between 1990 and 2010. Frailty was defined as having at least 3 of the following 5 criteria from the FRAIL scale: fatigue, poor strength, reduced aerobic capacity, having ≥5 chronic illnesses, and weight loss ≥5%. The occurrence of frailty was assessed every 4 years from 1992 to 2014. During 22 years of follow-up, we identified 11,559 incident cases of frailty. Consumption of SSBs was associated with higher risk of frailty after adjustment for diet quality, body mass index (BMI), smoking status, and medication use, specifically, the relative risks (RRs) and 95% confidence interval (95% CI) for ≥2 serving/day versus no SSB consumption was 1.32 (1.10, 1.57); p-value <0.001. ASBs were also associated with frailty [RR ≥2 serving/day versus no consumption: 1.28 (1.17, 1.39); p-value <0.001]. Orange juice was associated with lower risk of frailty [RR ≥1 serving/day versus no consumption: 0.82 (0.76, 0.87); p-value <0.001], whereas other juices were associated with a slightly higher risk [RR ≥1 serving/day versus no consumption: 1.15 (1.03, 1.28); p-value <0.001]. A limitation of this study is that, due to self-reporting of diet and frailty, certain misclassification bias cannot be ruled out; also, some residual confounding may persist.

Conclusions

In this study, we observed that consumption of SSBs and ASBs was associated with a higher risk of frailty. However, orange juice intake showed an inverse association with frailty. These results need to be confirmed in further studies using other frailty definitions.

Ellen Struijk and colleagues investigate the association between sweetened beverage consumption and risk of frailty later in life.

18.
Current metrics for estimating a scientist’s academic performance treat the author’s publications as if these were solely attributable to the author. However, this approach ignores the substantive contributions of co-authors, leading to misjudgments about the individual’s own scientific merits and consequently to misallocation of funding resources and academic positions. This problem is becoming more urgent in the biomedical field, where the number of collaborations is growing rapidly, making it increasingly hard to support the best scientists. Therefore, here we introduce a simple harmonic weighing algorithm for correcting citations and citation-based metrics such as the h-index for co-authorships. This weighing algorithm can account for both the number of co-authors and the sequence of authors on a paper. We then derive a measure called the ‘profit (p)-index’, which estimates the contribution of co-authors to the work of a given author. By using samples of researchers from a renowned Dutch University hospital, Spinoza Prize laureates (the most prestigious Dutch science award), and Nobel Prize laureates in Physiology or Medicine, we show that the contribution of co-authors to the work of a particular author is generally substantial (i.e., about 80%) and that researchers’ relative rankings change materially when adjusted for the contributions of co-authors. Interestingly, although the top University hospital researchers had the highest h-indices, this appeared to be due to their significantly higher p-indices. Importantly, the ranking completely reversed when using the profit adjusted h-indices, with the Nobel laureates having the highest, the Spinoza Prize laureates having an intermediate, and the top University hospital researchers having the lowest profit adjusted h-indices, respectively, suggesting that exceptional researchers are characterized by a relatively high degree of scientific independency/originality. The concepts and methods introduced here may thus provide a fairer impression of a scientist’s autonomous academic performance.
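
The general idea of weighting credit by author position can be sketched with plain harmonic counting, in which the author at byline position i of N receives a share proportional to 1/i. This is an assumption made for illustration: the abstract does not spell out the authors' exact weights (for example, how last authors are treated), and the function names and the example publication list are hypothetical.

```python
def harmonic_weights(num_authors):
    """Credit share per byline position under plain harmonic counting."""
    normaliser = sum(1 / k for k in range(1, num_authors + 1))
    return [(1 / position) / normaliser for position in range(1, num_authors + 1)]


def weighted_citations(papers):
    """Sum citations scaled by the author's harmonic share.

    papers: iterable of (citations, author_position, num_authors),
    with author_position counted from 1.
    """
    return sum(
        citations * harmonic_weights(num_authors)[position - 1]
        for citations, position, num_authors in papers
    )


if __name__ == "__main__":
    # Hypothetical record: first author of a 3-author paper, sole author of another.
    record = [(110, 1, 3), (10, 1, 1)]
    print(harmonic_weights(3))
    print(weighted_citations(record))
```

Under this scheme the shares on each paper sum to 1, so a sole author keeps full credit while later byline positions on large teams receive progressively less.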

19.
20.

Objective

Biomedical literature is increasingly enriched with literature reviews and meta-analyses. We sought to assess the understanding of statistical terms routinely used in such studies, among researchers.

Methods

An online survey posing 4 clinically-oriented multiple-choice questions was conducted in an international sample of randomly selected corresponding authors of articles indexed by PubMed.

Results

A total of 315 unique complete forms were analyzed (participation rate 39.4%), mostly from Europe (48%), North America (31%), and Asia/Pacific (17%). Only 10.5% of the participants answered correctly all 4 “interpretation” questions while 9.2% answered all questions incorrectly. Regarding each question, 51.1%, 71.4%, and 40.6% of the participants correctly interpreted statistical significance of a given odds ratio, risk ratio, and weighted mean difference with 95% confidence intervals respectively, while 43.5% correctly replied that no statistical model can adjust for clinical heterogeneity. Clinicians had more correct answers than non-clinicians (mean score ± standard deviation: 2.27±1.06 versus 1.83±1.14, p<0.001); among clinicians, there was a trend towards a higher score in medical specialists (2.37±1.07 versus 2.04±1.04, p = 0.06) and a lower score in clinical laboratory specialists (1.7±0.95 versus 2.3±1.06, p = 0.08). No association was observed between the respondents' region or questionnaire completion time and participants' score.

Conclusion

A considerable proportion of researchers, randomly selected from a diverse international sample of biomedical scientists, misinterpreted statistical terms commonly reported in meta-analyses. Authors could be prompted to explicitly interpret their findings to prevent misunderstandings and readers are encouraged to keep up with basic biostatistics.
