1.

The Use of Bayesian Networks to Assess the Quality of Evidence from Research Synthesis: 1.

Gavin B. Stewart Julian P. T. Higgins Holger Schünemann Nick Meader 《PloS one》2015,10(4)

Background

The grades of recommendation, assessment, development and evaluation (GRADE) approach is widely implemented in systematic reviews, health technology assessment and guideline development organisations throughout the world. A key advantage to this approach is that it aids transparency regarding judgments on the quality of evidence. However, the intricacies of making judgments about research methodology and evidence make the GRADE system complex and challenging to apply without training.

Methods

We have developed a semi-automated quality assessment tool (SAQAT) based on GRADE. This is informed by reviewers' responses to checklist questions regarding characteristics that may lead to unreliability. These responses are then entered into the Bayesian network to ascertain the probabilities of risk of bias, inconsistency, indirectness, imprecision and publication bias conditional on review characteristics. The model then combines these probabilities to provide a probability for each of the GRADE overall quality categories. We tested the model using a range of plausible scenarios that guideline developers or review authors could encounter.

Results

Overall, the model reproduced GRADE judgements for a range of scenarios. Potential advantages over standard assessment are use of explicit and consistent weightings for different review characteristics, forcing consideration of important but sometimes neglected characteristics and principled downgrading where small but important probabilities of downgrading are accrued across domains.

Conclusions

Bayesian networks have considerable potential for use as tools to assess the validity of research evidence. The key strength of such networks lies in the provision of a statistically coherent method for combining probabilities across a complex framework based on both belief and evidence. In addition to providing tools for less experienced users to implement reliability assessment, the potential for sensitivity analyses and automation may be beneficial for application and the methodological development of reliability tools.

2.

Computer-Assisted Interpretation of the EEG Background Pattern: A Clinical Evaluation

Shaun S. Lodder Jessica Askamp Michel J. A. M. van Putten 《PloS one》2014,9(1)

Objective

Interpretation of the EEG background pattern in routine recordings is an important part of clinical reviews. We evaluated the feasibility of an automated analysis system to assist reviewers with evaluation of the general properties in the EEG background pattern.

Methods

Quantitative EEG methods were used to describe the following five background properties: posterior dominant rhythm frequency and reactivity, anterior-posterior gradients, presence of diffuse slow-wave activity and asymmetry. Software running the quantitative methods was given to ten experienced electroencephalographers together with 45 routine EEG recordings and computer-generated reports. Participants were asked to review the EEGs by visual analysis first, and afterwards to compare their findings with the generated reports and correct mistakes made by the system. Corrected reports were returned for comparison.

Results

Using a gold standard derived from the consensus of reviewers, inter-rater agreement was calculated for all reviewers and for automated interpretation. Automated interpretation together with most participants showed high (kappa > 0.6) agreement with the gold standard. In some cases, automated analysis showed higher agreement with the gold standard than participants. When asked in a questionnaire after the study, all participants considered computer-assisted interpretation to be useful for everyday use in routine reviews.
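Several abstracts in this list report agreement as Cohen's kappa. As a minimal sketch of how that statistic is computed for two raters, the following assumes a plain two-rater, nominal-category setting; the function name `cohens_kappa` and the example labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels: a consensus reading versus an automated reading.
gold = ["normal", "normal", "slow", "slow", "normal", "slow", "normal", "normal"]
auto = ["normal", "normal", "slow", "normal", "normal", "slow", "normal", "slow"]
kappa = cohens_kappa(gold, auto)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it can be much lower than the simple proportion of matching labels.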

Conclusions

Automated interpretation methods proved to be accurate and were considered to be useful by all participants.

Significance

Computer-assisted interpretation of the EEG background pattern can bring consistency to reviewing and improve efficiency and inter-rater agreement.

3.

From the trenches: a cross-sectional study applying the GRADE tool in systematic reviews of healthcare interventions

Hartling L Fernandes RM Seida J Vandermeer B Dryden DM 《PloS one》2012,7(4):e34697

4.

Gynecomastia in Patients with Prostate Cancer: A Systematic Review

Anders Fagerlund Luigi Cormio Lina Palangi Richard Lewin Fabio Santanelli di Pompeo Anna Elander Gennaro Selvaggi 《PloS one》2015,10(8)

Introduction

Gynecomastia and/or mastodynia is a common medical problem in patients receiving antiandrogen (bicalutamide or flutamide) treatment for prostate cancer; up to 70% of these patients are affected, and this can jeopardise their quality of life.

Aims

To systematically review the quality of evidence of the current literature regarding treatment options for bicalutamide-induced gynecomastia, including efficacy, safety and patients’ quality of life.

Methods

The PubMed, Medline, Scopus, The Cochrane Library and SveMed+ databases were systematically searched between January 1, 2000 and December 31, 2014. All searches were undertaken between January and February 2015. The search phrase used was: "gynecomastia AND treatment AND prostate cancer". Two reviewers assessed the 762 titles and abstracts identified. The search and review process was done in accordance with the PRISMA statement. The PICOS (patients, intervention, comparator, outcomes and study design) process was used to specify inclusion criteria. Quality of evidence was rated according to GRADE.

Main Outcome Measures

Primary outcomes were: treatment effects, number of complications and side effects. Secondary outcome was: Quality of Life.

Results

Eleven studies met the inclusion criteria and are analysed in this review. Five studies reported pharmacological intervention with tamoxifen and/or anastrozole, either as prophylactic or therapeutic treatment. Four studies reported radiotherapy as prophylactic and/or therapeutic treatment. Two studies compared pharmacological treatment to radiotherapy. Most of the studies were randomized with varying risk of bias. According to GRADE, quality of evidence was moderate to high.

Conclusions

Bicalutamide-induced gynecomastia and/or mastodynia can effectively be managed by oral tamoxifen (10–20 mg daily) or radiotherapy without relevant side effects. Prophylaxis or therapeutic treatment with tamoxifen is more effective than radiotherapy.

5.

Complete Preoperative Evaluation of Pulmonary Atresia with Ventricular Septal Defect with Multi-Detector Computed Tomography

Jingzhe Liu Hongyin Li Zhibo Liu Qingyu Wu Yufeng Xu 《PloS one》2016,11(1)

Objective

To compare multi-detector computed tomography (MDCT) with cardiac catheterization and transthoracic echocardiography (TTE) in comprehensive evaluation of the global cardiovascular anatomy in patients with pulmonary atresia with ventricular septal defect (PA-VSD).

Methods

The clinical and imaging data of 116 patients with PA-VSD confirmed by surgery were reviewed. Using findings at surgery as the reference standard, data from MDCT, TTE and catheterization were reviewed for assessment of native pulmonary vasculature and intracardiac defects.

Results

MDCT was more accurate than catheterization and TTE in identification of native pulmonary arteries. MDCT was also the most accurate test for delineation of the major aortopulmonary collateral arteries. The inter-modality agreement for evaluation of overriding aorta and of VSD was excellent in both cases. In the subgroup with surgical correlation, excellent agreement was found between TTE and surgery, and substantial agreement was also found at MDCT.

Conclusion

MDCT can correctly delineate the native pulmonary vasculature and intracardiac defects and may be a reliable method for noninvasive assessment of global cardiovascular abnormalities in patients with PA-VSD.

6.

Reliability and Validity Study of Clinical Ultrasound Imaging on Lateral Curvature of Adolescent Idiopathic Scoliosis

Q. Wang M. Li Edmond H. M. Lou M. S. Wong 《PloS one》2015,10(8)

Background

Non-ionizing radiation imaging assessment has been advocated for the patients with adolescent idiopathic scoliosis (AIS). As one of the radiation-free methods, ultrasound imaging has gained growing attention in scoliosis assessment over the past decade. The center of laminae (COL) method has been proposed to measure the spinal curvature in the coronal plane of ultrasound image. However, the reliability and validity of this ultrasound method have not been validated in the clinical setting.

Objectives

To evaluate the reliability and validity of clinical ultrasound imaging on lateral curvature measurements of AIS with their corresponding magnetic resonance imaging (MRI) measurements.

Methods

Thirty curves (range 10.2°–68.2°) from sixteen patients with AIS were eligible for this study. The ultrasound scan was performed using a 3-D ultrasound unit on the same morning as the MRI examination. Two researchers were involved in data collection for these two examinations. The COL method was used to measure the coronal curvature in the ultrasound image and was compared with the Cobb method in MRI. The intra- and inter-rater reliability of the COL method was evaluated by intra-class correlation coefficient (ICC). The validity of this method was analyzed by paired Student’s t-test, Bland–Altman statistics and Pearson correlation coefficient. The level of significance was set at 0.05.
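The Bland–Altman statistics mentioned above summarise agreement between two measurement methods as a mean difference (bias) and 95% limits of agreement. The sketch below is a generic illustration, assuming paired measurements and the conventional bias ± 1.96 SD limits; the function name `bland_altman` and the angle values are hypothetical and do not come from the study.

```python
from math import sqrt
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)        # systematic difference between the methods
    spread = stdev(diffs)     # sample standard deviation of the differences
    return bias, bias - 1.96 * spread, bias + 1.96 * spread


# Hypothetical curvature angles in degrees from two methods.
ultrasound = [10.0, 20.0, 30.0, 40.0]
mri = [11.0, 19.0, 32.0, 38.0]
bias, lower, upper = bland_altman(ultrasound, mri)
```

A bias near zero with narrow limits suggests the two methods could be used interchangeably; whether the limits are acceptably narrow is a clinical judgment rather than a statistical one.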

Results

The COL method showed high intra- and inter-rater reliabilities (both with ICC (2, K) >0.9, p<0.05) to measure the coronal curvature. Compared with Cobb method, COL method showed no significant difference (p<0.05) when measuring coronal curvature. Furthermore, Bland-Altman method demonstrated an agreement between these two methods, and Pearson’s correlation coefficient (r) was high (r>0.9, p<0.05).

Conclusion

Ultrasound imaging could provide a reliable and valid measurement of spinal curvature in the coronal plane using the COL method. Further research is needed to validate the proposed ultrasound measurement in a larger clinical trial and to optimize the ultrasound scanning and measuring procedure.

7.

Retinal Arteriolar Morphometry Based on Full Width at Half Maximum Analysis of Spectral-Domain Optical Coherence Tomography Images

Yu Hua Tong Tie Pei Zhu Ze Lin Zhao Hai Jing Zhan Fang Zheng Jiang Heng Li Lian 《PloS one》2015,10(12)

Objectives

In this study, we develop a microdensitometry method using full width at half maximum (FWHM) analysis of the retinal vascular structure in a spectral-domain optical coherence tomography (SD-OCT) image and present the application of this method in the morphometry of arteriolar changes during hypertension.

Methods

Two raters using manual and FWHM methods measured retinal vessel outer and lumen diameters in SD-OCT images. Inter-rater reproducibility was measured using coefficients of variation (CV), intraclass correlation coefficient and a Bland-Altman plot. OCT images from 43 eyes of 43 hypertensive patients and 40 eyes of 40 controls were analyzed using an FWHM approach; wall thickness, wall cross-sectional area (WCSA) and wall to lumen ratio (WLR) were subsequently calculated.
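For readers unfamiliar with full width at half maximum, the following is a generic sketch of the measurement on a one-dimensional intensity profile. It assumes a single-peaked profile, takes the minimum sample as the baseline, and uses linear interpolation at the half-maximum crossings; none of these choices is confirmed by the abstract, and the function name `fwhm` and the sample values are hypothetical.

```python
def fwhm(positions, intensities):
    """Full width at half maximum of a single-peaked intensity profile."""
    if len(positions) != len(intensities) or len(positions) < 3:
        raise ValueError("need at least three paired samples")
    baseline = min(intensities)
    peak_idx = max(range(len(intensities)), key=intensities.__getitem__)
    half = baseline + (intensities[peak_idx] - baseline) / 2.0

    def crossing(i_below, i_above):
        # Linear interpolation between a sample below the half-maximum
        # level and its neighbour at or above that level.
        x0, x1 = positions[i_below], positions[i_above]
        y0, y1 = intensities[i_below], intensities[i_above]
        return x0 + (half - y0) * (x1 - x0) / (y1 - y0)

    # Walk outwards from the peak while samples stay at or above half maximum.
    left = peak_idx
    while left > 0 and intensities[left - 1] >= half:
        left -= 1
    right = peak_idx
    while right < len(intensities) - 1 and intensities[right + 1] >= half:
        right += 1
    if left == 0 or right == len(intensities) - 1:
        raise ValueError("profile does not fall below half maximum on both sides")
    return crossing(right + 1, right) - crossing(left - 1, left)


# Hypothetical profile: positions in micrometres, intensities in arbitrary units.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0.0, 1.0, 3.0, 4.0, 3.0, 1.0, 0.0]
width = fwhm(xs, ys)
```

Because the width is read off at a fixed fraction of the peak height, the result does not depend on where a rater judges the edge to be.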

Results

Mean difference in inter-rater agreement ranged from -2.713 to 2.658 μm when using a manual method, and ranged from -0.008 to 0.131 μm when using an FWHM approach. The inter-rater CVs were significantly less for the FWHM approach versus the manual method (P < 0.05). Compared with controls, the wall thickness, WCSA and WLR of retinal arterioles were increased in the hypertensive patients, particularly in diabetic hypertensive patients.

Conclusions

The microdensitometry method using an FWHM algorithm markedly improved inter-rater reproducibility of arteriolar morphometric analysis, and SD-OCT may represent a promising noninvasive method for in vivo arteriolar morphometry.

8.

Medical Decision-Making Incapacity among Newly Diagnosed Older Patients with Hematological Malignancy Receiving First Line Chemotherapy: A Cross-Sectional Study of Patients and Physicians

Koji Sugano Toru Okuyama Shinsuke Iida Hirokazu Komatsu Takashi Ishida Shigeru Kusumoto Megumi Uchida Tomohiro Nakaguchi Yosuke Kubota Yoshinori Ito Kazuhisa Takahashi Tatsuo Akechi 《PloS one》2015,10(8)

Background

Decision-making capacity to provide informed consent regarding treatment is essential among cancer patients. The purpose of this study was to identify the frequency of decision-making incapacity among newly diagnosed older patients with hematological malignancy receiving first-line chemotherapy, to examine factors associated with incapacity and assess physicians’ perceptions of patients’ decision-making incapacity.

Methods

Consecutive patients aged 65 years or over with a primary diagnosis of malignant lymphoma or multiple myeloma were recruited. Decision-making capacity was assessed using the Structured Interview for Competency and Incompetency Assessment Testing and Ranking Inventory-Revised (SICIATRI-R). Cognitive impairment, depressive condition and other possible associated factors were also evaluated.

Results

Among 139 eligible patients registered for this study, 114 completed the survey. Of these, 28 (25%, 95% confidence interval [CI]: 17%-32%) were judged as having some extent of decision-making incompetency according to SICIATRI-R. Higher levels of cognitive impairment and increasing age were significantly associated with decision-making incapacity. Physicians experienced difficulty performing competency assessment (Cohen’s kappa -0.54).

Conclusions

Decision-making incapacity was found to be a common and under-recognized problem in older patients with cancer. Age and assessment of cognitive impairment may provide the opportunity to find patients that are at a high risk of showing decision-making incapacity.

9.

The OSCAR-IB consensus criteria for retinal OCT quality assessment

Tewarie P Balk L Costello F Green A Martin R Schippling S Petzold A 《PloS one》2012,7(4):e34823

Background

Retinal optical coherence tomography (OCT) is an imaging biomarker for neurodegeneration in multiple sclerosis (MS). In order to become validated as an outcome measure in multicenter studies, reliable quality control (QC) criteria with high inter-rater agreement are required.

Methods/Principal Findings

A prospective multicentre study on developing consensus QC criteria for retinal OCT in MS: (1) a literature review on OCT QC criteria; (2) application of these QC criteria to a training set of 101 retinal OCT scans from patients with MS; (3) kappa statistics for inter-rater agreement; (4) identification of reasons for inter-rater disagreement; (5) development of new consensus QC criteria; (6) testing of the new QC criteria on the training set and (7) prospective validation on a new set of 159 OCT scans from patients with MS. The inter-rater agreement for acceptable scans among OCT readers (n = 3) was moderate (kappa 0.45) based on the non-validated QC criteria, which were entirely based on the ophthalmological literature. A new set of QC criteria was developed based on recognition of: (O) obvious problems, (S) poor signal strength, (C) centration of scan, (A) algorithm failure, (R) retinal pathology other than MS related, (I) illumination and (B) beam placement. Adhering to these OSCAR-IB QC criteria increased the inter-rater agreement from moderate to substantial (kappa 0.61 in the training set and 0.61 in the prospective validation).

Conclusions

This study presents the first validated consensus QC criteria for retinal OCT reading in MS. The high inter-rater agreement suggests that the OSCAR-IB QC criteria should be considered in the context of multicentre studies and trials in MS.

10.

Interventions to Improve Adherence in Patients with Immune-Mediated Inflammatory Disorders: A Systematic Review

Fanny Depont Francis Berenbaum Jérome Filippi Michel Le Maitre Henri Nataf Carle Paul Laurent Peyrin-Biroulet Emmanuel Thibout 《PloS one》2015,10(12)

Background

In patients with immune-mediated inflammatory disorders, poor adherence to medication is associated with increased healthcare costs, decreased patient satisfaction, reduced quality of life and unfavorable treatment outcomes.

Objective

To determine the impact of different interventions on medication adherence in patients with immune-mediated inflammatory disorders.

Design

Systematic review.

Data sources

MEDLINE, EMBASE and Cochrane Library.

Study eligibility criteria for selecting studies

Included studies were clinical trials and observational studies in adult outpatients treated for psoriasis, Crohn’s disease, ulcerative colitis, rheumatoid arthritis, spondyloarthritis, psoriatic arthritis or multiple sclerosis.

Study appraisal and synthesis methods

Intervention approaches were classified into four categories: educational, behavioral, cognitive behavioral, and multicomponent interventions. The risk of bias/study limitations of each study was assessed using the GRADE system.

Results

Fifteen studies (14 clinical trials and one observational study) met eligibility criteria and enrolled a total of 1958 patients. Forty percent of the studies (6/15) were conducted in patients with inflammatory bowel disease, nearly half (7/15) in rheumatoid arthritis patients, one in psoriasis patients and one in multiple sclerosis patients. Seven out of 15 interventions were classified as multicomponent, four as educational, two as behavioral and two as cognitive behavioral. Nine studies, of which five were multicomponent interventions, had no serious limitations according to GRADE criteria. Nine out of 15 interventions showed an improvement of adherence: three multicomponent interventions in inflammatory bowel disease; one intervention of each category in rheumatoid arthritis; one multicomponent in psoriasis and one multicomponent in multiple sclerosis.

Conclusion

The assessment of interventions designed for increasing medication adherence in IMID is rare in the literature and their methodological quality may be improved in upcoming studies. Nonetheless, multicomponent interventions showed the strongest evidence for promoting adherence in patients with IMID.

11.

Treatment Effects of Removable Functional Appliances in Pre-Pubertal and Pubertal Class II Patients: A Systematic Review and Meta-Analysis of Controlled Studies

Giuseppe Perinetti Jasmina Primožič Lorenzo Franchi Luca Contardo 《PloS one》2015,10(10)

Background

Treatment effects of removable functional appliances in Class II malocclusion patients according to the pre-pubertal or pubertal growth phase have yet to be clarified.

Objectives

To assess and compare skeletal and dentoalveolar effects of removable functional appliances in Class II malocclusion treatment between pre-pubertal and pubertal patients.

Search methods

Literature survey using the Medline, SCOPUS, LILACS and SciELO databases and the Cochrane Library from inception to May 31, 2015. A manual search was also performed.

Selection criteria

Randomised (RCTs) or controlled clinical trials with a matched untreated control group. No restrictions were set regarding the type of removable appliance whenever used alone.

Data collection and analysis

For the meta-analysis, cephalometric parameters on the supplementary mandibular growth were the main outcomes, with other cephalometric parameters considered as secondary outcomes. Risk of bias within individual studies and across studies was evaluated, along with sensitivity analysis for low-quality studies. Mean differences and 95% confidence intervals for annualised changes were computed according to a random-effects model. Differences between pre-pubertal and pubertal patients were assessed by subgroup analyses. GRADE assessment was performed for the main outcomes.

Results

Twelve articles (but only 3 RCTs) were included, accounting for 8 pre-pubertal and 7 pubertal groups. Overall supplementary total mandibular length and mandibular ramus height were 0.95 mm (0.38, 1.51) and 0.00 mm (-0.52, 0.53) for pre-pubertal patients and 2.91 mm (2.04, 3.79) and 2.18 mm (1.51, 2.86) for pubertal patients, respectively. The subgroup difference was significant for both parameters (p<0.001). No maxillary growth restraint or increase in facial divergence was seen in either subgroup. The GRADE assessment was low for the pre-pubertal patients, and generally moderate for the pubertal patients.

Conclusions

Taking into account the limited quality and heterogeneity of the included studies, functional treatment by removable appliances may be effective in treating Class II malocclusion with clinically relevant skeletal effects if performed during the pubertal growth phase.

12.

Assessing Social Networks in Patients with Psychotic Disorders: A Systematic Review of Instruments

Joyce Siette Claudia Gulea Stefan Priebe 《PloS one》2015,10(12)

Background

Evidence suggests that social networks of patients with psychotic disorders influence symptoms, quality of life and treatment outcomes. It is therefore important to assess social networks for which appropriate and preferably established instruments should be used.

Aims

To identify instruments assessing social networks in studies of patients with psychotic disorders and explore their properties.

Method

A systematic search of electronic databases was conducted to identify studies that used a measure of social networks in patients with psychotic disorders.

Results

Eight instruments were identified, all of which had been developed before 1991. They have been used in 65 studies (total N of patients = 8,522). They assess one or more aspects of social networks such as their size, structure, dimensionality and quality. Most instruments have various shortcomings, including questionable inter-rater and test-retest reliability.

Conclusions

The assessment of social networks in patients with psychotic disorders is characterized by a variety of approaches which may reflect the complexity of the construct. Further research on social networks in patients with psychotic disorders would benefit from advanced and more precise instruments using comparable definitions of and timescales for social networks across studies.

13.

Assessing Communication Skills of Medical Students in Objective Structured Clinical Examinations (OSCE) - A Systematic Review of Rating Scales

Musa Cömert Jördis Maria Zill Eva Christalle Jörg Dirmaier Martin Härter Isabelle Scholl 《PloS one》2016,11(3)

Background

Teaching and assessment of communication skills have become essential in medical education. The Objective Structured Clinical Examination (OSCE) has been found to be an appropriate means to assess communication skills within medical education. Studies have demonstrated the importance of a valid assessment of medical students’ communication skills. Yet, the validity of the performance scores depends fundamentally on the quality of the rating scales used in an OSCE. Thus, this systematic review aimed to provide an overview of existing rating scales, describe their underlying definition of communication skills, and determine the methodological quality of psychometric studies and the quality of psychometric properties of the identified rating scales.

Methods

We conducted a systematic review to identify psychometrically tested rating scales, which have been applied in OSCE settings to assess communication skills of medical students. Our search strategy comprised three databases (EMBASE, PsycINFO, and PubMed), reference tracking and consultation of experts. We included studies that reported psychometric properties of communication skills assessment rating scales used in OSCEs by examiners only. The methodological quality of included studies was assessed using the COnsensus based Standards for the selection of health status Measurement INstruments (COSMIN) checklist. The quality of psychometric properties was evaluated using the quality criteria of Terwee and colleagues.

Results

Data of twelve studies reporting on eight rating scales on communication skills assessment in OSCEs were included. Five of eight rating scales were explicitly developed based on a specific definition of communication skills. The methodological quality of studies was mainly poor. The psychometric quality of the eight rating scales was mainly intermediate.

Discussion

Our results reveal that future psychometric evaluation studies focusing on improving the methodological quality are needed in order to yield psychometrically sound results of the OSCEs assessing communication skills. This is especially important given that most OSCE rating scales are used for summative assessment, and thus have an impact on medical students’ academic success.

14.

Clinical application of cine-MRI in the visual assessment of mitral regurgitation compared to echocardiography and cardiac catheterization

J Heitner GP Bhumireddy AL Crowley J Weinsaft SA Haq I Klem RJ Kim JG Jollis 《PloS one》2012,7(7):e40491

Background

Detecting and quantifying the severity of mitral regurgitation is essential for risk stratification and clinical decision-making regarding timing of surgery. Our objective was to assess specific visual parameters by cine-magnetic resonance imaging (MRI) in the determination of the severity of mitral regurgitation and to compare it to previously validated imaging modalities: echocardiography and cardiac ventriculography.

Methods

The study population consisted of 68 patients who underwent cardiac MRI followed by an echocardiogram within a median time of 2.0 days; 49 of these patients also had a cardiac catheterization, within a median time of 2.0 days. The inter-rater agreement statistic (kappa) was used to evaluate agreement.

Results

There was moderate agreement between cine MRI and Doppler echocardiography in assessing mitral regurgitation severity, with a kappa value of 0.47, confidence interval (CI) 0.29–0.65. There was also fair agreement between cine MRI and cardiac catheterization with a kappa value of 0.36, CI of 0.17–0.55.

Conclusion

Cine MRI offers a reasonable alternative to both Doppler echocardiography and, to a lesser extent, cardiac catheterization for visually assessing the severity of mitral regurgitation with specific visual parameters during routine clinical cardiac MRI.

15.

Development and inter-rater reliability of the Liverpool adverse drug reaction causality assessment tool

Gallagher RM Kirkham JJ Mason JR Bird KA Williamson PR Nunn AJ Turner MA Smyth RL Pirmohamed M 《PloS one》2011,6(12):e28096

Aim

To develop and test a new adverse drug reaction (ADR) causality assessment tool (CAT).

Methods

A comparison between seven assessors of a new CAT, formulated by an expert focus group, compared with the Naranjo CAT in 80 cases from a prospective observational study and 37 published ADR case reports (819 causality assessments in total).

Main Outcome Measures

Utilisation of causality categories, measure of disagreements, inter-rater reliability (IRR).

Results

The Liverpool ADR CAT, using 40 cases from an observational study, showed causality categories of 1 unlikely, 62 possible, 92 probable and 125 definite (1, 62, 92, 125) and ‘moderate’ IRR (kappa 0.48), compared to Naranjo (0, 100, 172, 8) with ‘moderate’ IRR (kappa 0.45). In a further 40 cases, the Liverpool tool (0, 66, 81, 133) showed ‘good’ IRR (kappa 0.6) while Naranjo (1, 90, 185, 4) remained ‘moderate’.

Conclusion

The Liverpool tool assigns the full range of causality categories and shows good IRR. Further assessment by different investigators in different settings is needed to fully assess the utility of this tool.

16.

The Missing Medians: Exclusion of Ordinal Data from Meta-Analyses

Toby B. Cumming Leonid Churilov Emily S. Sena 《PloS one》2015,10(12)

Background

Meta-analyses are considered the gold standard of evidence-based health care, and are used to guide clinical decisions and health policy. A major limitation of current meta-analysis techniques is their inability to pool ordinal data. Our objectives were to determine the extent of this problem in the context of neurological rating scales and to provide a solution.

Methods

Using an existing database of clinical trials of oral neuroprotective therapies, we identified the 6 most commonly used clinical rating scales and recorded how data from these scales were reported and analysed. We then identified systematic reviews of studies that used these scales (via the Cochrane database) and recorded the meta-analytic techniques used. Finally, we identified a statistical technique for calculating a common language effect size measure for ordinal data.

Results

We identified 103 studies, with 128 instances of the 6 clinical scales being reported. The majority (80%) reported means alone for central tendency, with only 13% reporting medians. In analysis, 40% of studies used parametric statistics alone, 34% of studies employed non-parametric analysis, and 26% did not include or specify analysis. Of the 60 systematic reviews identified that included meta-analysis, 88% used mean difference and 22% employed difference in proportions; none included rank-based analysis. We propose the use of a rank-based generalised odds ratio (WMW GenOR) as an assumption-free effect size measure that is easy to compute and can be readily combined in meta-analysis.
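To make the idea of a rank-based generalised odds ratio concrete, the sketch below compares every treatment score with every control score. It assumes that higher scores are better and that ties are split evenly between the two groups, which is one common Wilcoxon-Mann-Whitney convention and may differ in detail from the authors' estimator; the function name `wmw_genor` and the scores are hypothetical.

```python
def wmw_genor(treatment, control):
    """Odds that a random treatment score beats a random control score."""
    if not treatment or not control:
        raise ValueError("both groups must be non-empty")
    wins = ties = losses = 0
    for t in treatment:
        for c in control:
            if t > c:
                wins += 1
            elif t < c:
                losses += 1
            else:
                ties += 1
    # Assumption: each tie counts as half a win and half a loss.
    better = wins + 0.5 * ties
    worse = losses + 0.5 * ties
    if worse == 0:
        return float("inf")  # treatment is better in every comparison
    return better / worse


# Hypothetical ordinal scores on an arbitrary 0-5 scale.
treated = [3, 4, 4, 5]
controls = [2, 3, 4, 4]
genor = wmw_genor(treated, controls)
```

A value of 1 means neither group tends to score higher; the measure uses only the ordering of the scores, so it makes no assumption about the spacing between scale points.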

Conclusion

There is wide scope for improvement in the reporting and analysis of ordinal data in the literature. We hope that adoption of the WMW GenOR will have the dual effect of improving the reporting of data in individual studies while also increasing the inclusivity (and therefore validity) of meta-analyses.

17.

Assessment of Lower Limb Muscle Strength and Power Using Hand-Held and Fixed Dynamometry: A Reliability and Validity Study

Benjamin F. Mentiplay Luke G. Perraton Kelly J. Bower Brooke Adair Yong-Hao Pua Gavin P. Williams Rebekah McGaw Ross A. Clark 《PloS one》2015,10(10)

Introduction

Hand-held dynamometry (HHD) has never previously been used to examine isometric muscle power. Rate of force development (RFD) is often used for muscle power assessment; however, no consensus currently exists on the most appropriate method of calculation. The aim of this study was to examine the reliability of different algorithms for RFD calculation and to examine the intra-rater, inter-rater, and inter-device reliability of HHD as well as the concurrent validity of HHD for the assessment of isometric lower limb muscle strength and power.

Methods

Thirty healthy young adults (age: 23 ± 5 years, male: 15) were assessed in two sessions. Isometric muscle strength and power were measured using peak force and RFD respectively using two HHDs (Lafayette Model-01165 and Hoggan microFET2) and a criterion-reference KinCom dynamometer. Statistical analysis of reliability and validity comprised intraclass correlation coefficients (ICC), Pearson correlations, concordance correlations, standard error of measurement, and minimal detectable change.

Results

Comparison of RFD methods revealed that a peak 200ms moving window algorithm provided optimal reliability results. Intra-rater, inter-rater, and inter-device reliability analysis of peak force and RFD revealed mostly good to excellent reliability (coefficients ≥ 0.70) for all muscle groups. Concurrent validity analysis showed moderate to excellent relationships between HHD and fixed dynamometry for the hip and knee (ICCs ≥ 0.70) for both peak force and RFD, with mostly poor to good results shown for the ankle muscles (ICCs = 0.31–0.79).
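One plausible reading of a peak 200 ms moving window algorithm is the steepest average rise in force over any 200 ms interval of the trace. The sketch below implements that reading only; the abstract does not specify the algorithm, and the function name `peak_rfd`, the sampling rate and the force values are hypothetical.

```python
def peak_rfd(force, sample_rate_hz, window_s=0.2):
    """Largest average rate of force rise over any window of window_s seconds."""
    window = round(window_s * sample_rate_hz)  # window length in samples
    if window < 1 or len(force) <= window:
        raise ValueError("trace is too short for the requested window")
    duration = window / sample_rate_hz         # actual window length in seconds
    return max(
        (force[i + window] - force[i]) / duration
        for i in range(len(force) - window)
    )


# Hypothetical force trace in newtons, sampled at 10 Hz.
trace = [0.0, 10.0, 40.0, 90.0, 100.0, 100.0]
rfd = peak_rfd(trace, sample_rate_hz=10)  # in newtons per second
```

Sliding the window across the whole trace, rather than fixing it at contraction onset, avoids having to detect the onset time, which may be one reason such an approach could be more repeatable.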

Conclusions

Hand-held dynamometry has good to excellent reliability and validity for most measures of isometric lower limb strength and power in a healthy population, particularly for proximal muscle groups. To aid implementation we have created freely available software to extract these variables from data stored on the Lafayette device. Future research should examine the reliability and validity of these variables in clinical populations.

18.

Reliability of Trachoma Clinical Grading—Assessing Grading of Marginal Cases

Salman A. Rahman Sun N. Yu Abdou Amza Sintayehu Gebreselassie Boubacar Kadri Nassirou Baido Nicole E. Stoller Joseph P. Sheehan Travis C. Porco Bruce D. Gaynor Jeremy D. Keenan Thomas M. Lietman 《PLoS neglected tropical diseases》2014,8(5)

Background

Clinical examination of trachoma is used to justify intervention in trachoma-endemic regions. Currently, field graders are certified by determining their concordance with experienced graders using the kappa statistic. Unfortunately, trachoma grading can be highly variable and there are cases where even expert graders disagree (borderline/marginal cases). Prior work has shown that inclusion of borderline cases tends to reduce apparent agreement, as measured by kappa. Here, we confirm those results and assess performance of trainees on these borderline cases by calculating their reliability error, a measure derived from the decomposition of the Brier score.

Methods and Findings

We trained 18 field graders using 200 conjunctival photographs from a community-randomized trial in Niger and assessed inter-grader agreement using kappa as well as reliability error. Three experienced graders scored each case for the presence or absence of trachomatous inflammation - follicular (TF) and trachomatous inflammation - intense (TI). A consensus grade for each case was defined as the one given by a majority of experienced graders. We classified cases into a unanimous subset if all 3 experienced graders gave the same grade. For both TF and TI grades, the mean kappa for trainees was higher on the unanimous subset; inclusion of borderline cases reduced apparent agreement by 15.7% for TF and 12.4% for TI. When we assessed the breakdown of the reliability error, we found that our trainees tended to over-call TF grades and under-call TI grades, especially in borderline cases.
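The reliability term of the standard (Murphy) decomposition of the Brier score can be sketched as follows. Cases are grouped by the value the grader assigned, and each group's grade is compared with the observed frequency of the reference grade in that group. Treating a trainee's binary grade as the "forecast" and the consensus grade as the "outcome" is an assumption made here for illustration, as is the toy data; the abstract does not give the authors' exact formulation.

```python
from collections import defaultdict


def brier_reliability(forecasts, outcomes):
    """Reliability term of the Murphy decomposition of the Brier score.

    Returns (1/N) * sum_k n_k * (f_k - mean outcome in group k)^2,
    where group k holds all cases that received forecast value f_k.
    A value of 0 means perfectly calibrated grades.
    """
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("inputs must be non-empty and of equal length")
    groups = defaultdict(list)
    for f, o in zip(forecasts, outcomes):
        groups[f].append(o)
    total = sum(
        len(obs) * (f - sum(obs) / len(obs)) ** 2 for f, obs in groups.items()
    )
    return total / len(forecasts)


# Hypothetical data: 1 = sign present, 0 = sign absent.
trainee = [1, 1, 1, 1, 0, 0, 0, 0]
consensus = [1, 1, 0, 0, 0, 0, 0, 1]
rel = brier_reliability(trainee, consensus)
print(f"reliability error = {rel:.5f}")
```

Within each group, the sign of (grade − observed frequency) indicates direction: in this toy example the trainee's positive calls are confirmed only half the time, which corresponds to over-calling.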

Conclusions

The kappa statistic is widely used for certifying trachoma field graders. Exclusion of borderline cases, which even experienced graders disagree on, increases apparent agreement with the kappa statistic. Graders may agree less when exposed to the full spectrum of disease. Reliability error allows for the assessment of these borderline cases and can be used to refine an individual trainee's grading.

19.

The Role of Digital Rectal Examination for Diagnosis of Acute Appendicitis: A Systematic Review and Meta-Analysis

Toshihiko Takada Hiroki Nishiwaki Yosuke Yamamoto Yoshinori Noguchi Shingo Fukuma Shin Yamazaki Shunichi Fukuhara 《PloS one》2015,10(9)

Background

Digital rectal examination (DRE) has been traditionally recommended to evaluate acute appendicitis, although several reports indicate its lack of utility for this diagnosis. No meta-analysis has examined DRE for diagnosis of acute appendicitis.

Objectives

To assess the role of DRE for diagnosis of acute appendicitis.

Data Sources

Cochrane Library, PubMed, and SCOPUS from the earliest available date of indexing through November 23, 2014, with no language restrictions.

Study Selection

Clinical studies assessing DRE as an index test for diagnosis of acute appendicitis.

Data Extraction and Synthesis

Two independent reviewers extracted study data and assessed the quality, using the Quality Assessment of Diagnostic Accuracy Studies 2 tool. Bivariate random-effects models were used for the pooled sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, and diagnostic odds ratio (DOR) as point estimates with 95% confidence intervals (CI).

Main Outcomes and Measures

The main outcome measure was the diagnostic performance of DRE for diagnosis of acute appendicitis.

Results

We identified 19 studies with a total of 7511 patients. The pooled sensitivity and specificity were 0.49 (95% CI 0.42–0.56) and 0.61 (95% CI 0.53–0.67), respectively. The positive and negative likelihood ratios were 1.24 (95% CI 0.97–1.58) and 0.85 (95% CI 0.70–1.02), respectively. The DOR was 1.46 (0.95–2.26).
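The relationship between these measures can be checked with the textbook definitions LR+ = sensitivity / (1 − specificity), LR− = (1 − sensitivity) / specificity, and DOR = LR+ / LR−. The sketch below plugs in the pooled sensitivity and specificity reported above. It is only a rough consistency check: the reported likelihood ratios and DOR were pooled with bivariate random-effects models, so the plug-in values are not expected to match them exactly.

```python
def likelihood_ratios(sensitivity: float, specificity: float):
    """Return (LR+, LR-, DOR) from a single sensitivity/specificity pair."""
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly in (0, 1)")
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    dor = lr_pos / lr_neg  # diagnostic odds ratio
    return lr_pos, lr_neg, dor


# Pooled point estimates from the abstract.
lr_pos, lr_neg, dor = likelihood_ratios(sensitivity=0.49, specificity=0.61)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}, DOR = {dor:.2f}")
```

Likelihood ratios this close to 1 shift the post-test probability very little in either direction, which is the quantitative basis for the conclusion that DRE neither rules appendicitis in nor out.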

Conclusion and Relevance

Acute appendicitis cannot be ruled in or out through the result of DRE. Reconsideration is needed for the traditional teaching that rectal examination should be performed routinely in all patients with suspected appendicitis.

20.

Diagnosing Nodular Regenerative Hyperplasia of the Liver Is Thwarted by Low Interobserver Agreement

Bindia Jharap Dirk P. van Asseldonk Nanne K. H. de Boer Pierre Bedossa Joachim Diebold A. Mieke Jonker Emmanuelle Leteurtre Joanne Verheij Dominique Wendum Fritz Wrba Pieter E. Zondervan Jean-Frédéric Colombel Walter Reinisch Chris J. J. Mulder Elisabeth Bloemena Adriaan A. van Bodegraven NRH-pathology Investigators 《PloS one》2015,10(6)

Background and Aims

Nodular regenerative hyperplasia (NRH) of the liver is associated with several diseases and drugs. Clinical symptoms of NRH may vary from absence of symptoms to full-blown (non-cirrhotic) portal hypertension. However, diagnosing NRH is challenging. The objective of this study was to determine inter- and intraobserver agreement on the histopathologic diagnosis of NRH.

Methods

Liver specimens (n=48) previously diagnosed as NRH were reviewed for the presence of NRH by seven pathologists without prior knowledge of the original diagnosis or clinical background. The majority of the liver specimens were from thiopurine-using inflammatory bowel disease patients. Histopathologic features contributing to NRH were also assessed. Criteria for NRH were modified by consensus and subsequently validated. Interobserver agreement was evaluated by using the standard kappa index.

Results

After review, definite NRH, inconclusive NRH and no NRH were found in 35% (23-40%), 21% (13-27%) and 44% (38-56%), respectively (median, IQR). The median interobserver agreement for NRH was poor (κ = 0.20, IQR 0.14-0.28). The intraobserver variability on NRH ranged between 14% and 71%. After modification of the criteria and exclusion of biopsies with technical shortcomings, the interobserver agreement on the diagnosis NRH was fair (κ = 0.45).
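A median agreement across several observers can be obtained by computing Cohen's kappa for every pair of observers and taking the median of those values. The sketch below shows that approach; whether the study used pairwise Cohen's kappa or another variant of the "standard kappa index" is not stated in the abstract, and the ratings are hypothetical.

```python
from itertools import combinations
from statistics import median


def cohen_kappa(a, b):
    """Cohen's kappa for two raters scoring the same cases."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(a)
    labels = set(a) | set(b)
    p_obs = sum(x == y for x, y in zip(a, b)) / n  # observed agreement
    # Agreement expected by chance from each rater's marginal frequencies.
    p_exp = sum((a.count(lab) / n) * (b.count(lab) / n) for lab in labels)
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_obs - p_exp) / (1.0 - p_exp)


def median_pairwise_kappa(ratings):
    """Median of Cohen's kappa over all pairs of raters."""
    return median(cohen_kappa(x, y) for x, y in combinations(ratings, 2))


# Hypothetical ratings from three observers on four specimens (1 = NRH, 0 = no NRH).
rater_a = [1, 1, 0, 0]
rater_b = [1, 1, 0, 0]
rater_c = [1, 0, 1, 0]
med = median_pairwise_kappa([rater_a, rater_b, rater_c])
print(f"median pairwise kappa = {med:.2f}")
```

Because kappa corrects for chance agreement, two observers who agree on half of the cases can still score 0 when that level of agreement is what their marginal frequencies would produce by chance.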

Conclusions

The interobserver agreement on the histopathologic diagnosis of NRH was poor, even when assessed by well-experienced liver pathologists. Modification of the criteria of NRH based on consensus effort and exclusion of biopsies of poor quality improved interobserver agreement to a fair level. The main conclusion of this study is that NRH is a clinicopathologic diagnosis that cannot reliably be based on histopathology alone.