Similar articles
1.

Objectives

To evaluate the reliability of semiquantitative Vertebral Fracture Assessment (VFA) on chest Computed Tomography (CT).

Methods

Four observers performed VFA twice on sagittal reconstructions of 50 routine clinical chest CTs. Intra- and interobserver agreement (absolute agreement or 95% Limits of Agreement) and reliability (Cohen's kappa or intraclass correlation coefficient (ICC)) were calculated for the visual VFA measures (fracture present, worst fracture grade, cumulative fracture grade on patient level) and for percentage height loss of each fractured vertebra compared to the adjacent vertebrae.
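The two kappa variants named in these methods can be sketched as follows. This is a minimal illustration for two raters, not the authors' analysis code; the function name `cohen_kappa`, its arguments, and the toy ratings are assumptions made for the example.

```python
def cohen_kappa(ratings_a, ratings_b, categories, weighted=False):
    """Cohen's kappa for two raters.

    With weighted=True, quadratic (square) disagreement weights are used,
    which assumes `categories` is listed in ordinal order.
    """
    n = len(ratings_a)
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}

    # Observed joint proportions of (rater A category, rater B category).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal proportions per rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        # Disagreement weight: 0 on the diagonal, growing with distance.
        if not weighted:
            return 0.0 if i == j else 1.0
        return ((i - j) / (k - 1)) ** 2

    disagree_obs = sum(weight(i, j) * observed[i][j]
                       for i in range(k) for j in range(k))
    disagree_exp = sum(weight(i, j) * row[i] * col[j]
                       for i in range(k) for j in range(k))
    return 1.0 - disagree_obs / disagree_exp


# Hypothetical fracture grades (0 = none, 1 = mild, 2 = moderate).
reader_1 = [0, 1, 2, 2]
reader_2 = [0, 1, 2, 1]
unweighted = cohen_kappa(reader_1, reader_2, [0, 1, 2])
square_weighted = cohen_kappa(reader_1, reader_2, [0, 1, 2], weighted=True)
```

Quadratic weights penalise a one-grade disagreement less than a two-grade disagreement, which is why weighted kappa is the usual choice for ordinal grades such as the worst fracture grade.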

Results

Observers classified 24–38% of patients as having at least one vertebral fracture, giving rise to kappa values of 0.73–0.84 (intraobserver) and 0.56–0.81 (interobserver). For worst fracture grade we found good intraobserver (76–88%) and interobserver (74–88%) agreement, and excellent reliability with square-weighted kappa values of 0.84–0.90 (intraobserver) and 0.84–0.94 (interobserver). For cumulative fracture grade the 95% Limits of Agreement were maximally ±1.99 (intraobserver) and ±2.69 (interobserver), and the reliability (ICC) ranged from 0.84 to 0.94 (intraobserver) and from 0.74 to 0.94 (interobserver). For percentage height loss on a vertebral level the 95% Limits of Agreement were maximally ±11.75% (intraobserver) and ±12.53% (interobserver). The ICC was 0.59–0.90 (intraobserver) and 0.53–0.82 (interobserver). Further investigation is needed to evaluate the prognostic value of this approach.

Conclusion

In conclusion, these results demonstrate acceptable reproducibility of VFA on CT.

2.

Background

Fetal heart rate (FHR) variability is an indirect index of fetal autonomic nervous system (ANS) integrity. FHR variability analysis in labor fails to detect early hypoxia and acidemia. Phase-rectified signal averaging (PRSA) is a new method of complex biological signals analysis that is more resistant to non-stationarities, signal loss and artifacts. It quantifies the average cardiac acceleration and deceleration (AC/DC) capacity.

Objective

The aims of the study were: (1) to investigate AC/DC in ovine fetuses exposed to acute hypoxic-acidemic insult; (2) to explore the relation between AC/DC and acid-base balance; and (3) to evaluate the influence of FHR decelerations and specific PRSA parameters on AC/DC computation.

Methods

Repetitive umbilical cord occlusions (UCOs) were applied in 9 pregnant near-term sheep to obtain three phases of MILD, MODERATE, and SEVERE hypoxic-acidemic insult. Acid-base balance was sampled and fetal ECGs continuously recorded. AC/DC were calculated: (1) for a spectrum of T values (T = 1–50 beats; the parameter limits the range of oscillations detected by PRSA); (2) on entire series of fetal RR intervals or on “stable” series that excluded FHR decelerations caused by UCOs.

Results

AC and DC progressively increased with UCOs phases (MILD vs. MODERATE and MODERATE vs. SEVERE, p<0.05 for DC  = 2–5, and AC  = 1–3). The time evolution of AC/DC correlated to acid-base balance (0.4<<0.9, p<0.05) with the highest for . PRSA was not independent from FHR decelerations caused by UCOs.

Conclusions

This is the first in-vivo evaluation of PRSA on FHR analysis. In the presence of acute hypoxic-acidemia we found increasing values of AC/DC suggesting an activation of ANS. This correlation was strongest on time scale dominated by parasympathetic modulations. We identified the best performing parameters (), and found that AC/DC computation is not independent from FHR decelerations. These findings establish the basis for future clinical studies.

3.

Background

Detecting and quantifying the severity of mitral regurgitation is essential for risk stratification and clinical decision-making regarding timing of surgery. Our objective was to assess specific visual parameters by cine-magnetic resonance imaging (MRI) in the determination of the severity of mitral regurgitation and to compare it to previously validated imaging modalities: echocardiography and cardiac ventriculography.

Methods

The study population consisted of 68 patients who underwent a cardiac MRI followed by an echocardiogram within a median time of 2.0 days and 49 of these patients who had a cardiac catheterization, median time of 2.0 days. The inter-rater agreement statistic (Kappa) was used to evaluate the agreement.

Results

There was moderate agreement between cine MRI and Doppler echocardiography in assessing mitral regurgitation severity, with a kappa value of 0.47, confidence interval (CI) 0.29–0.65. There was also fair agreement between cine MRI and cardiac catheterization with a kappa value of 0.36, CI of 0.17–0.55.

Conclusion

Cine MRI offers a reasonable alternative to both Doppler echocardiography and, to a lesser extent, cardiac catheterization for visually assessing the severity of mitral regurgitation with specific visual parameters during routine clinical cardiac MRI.

4.

Background

Clinical scores of mammographic breast density are highly subjective. Automated technologies for mammography exist to quantify breast density objectively, but the technique that most accurately measures the quantity of breast fibroglandular tissue is not known.

Purpose

To compare the agreement of three automated mammographic techniques for measuring volumetric breast density with a quantitative volumetric MRI-based technique in a screening population.

Materials and Methods

Women were selected from the UCSF Medical Center screening population that had received both a screening MRI and digital mammogram within one year of each other, had Breast Imaging Reporting and Data System (BI-RADS) assessments of normal or benign finding, and no history of breast cancer or surgery. Agreement was assessed of three mammographic techniques (Single-energy X-ray Absorptiometry [SXA], Quantra, and Volpara) with MRI for percent fibroglandular tissue volume, absolute fibroglandular tissue volume, and total breast volume.

Results

Among 99 women, the automated mammographic density techniques were correlated with MRI measures with R² values ranging from 0.40 (log fibroglandular volume) to 0.91 (total breast volume). Substantial agreement measured by kappa statistic was found between all percent fibroglandular tissue measures (0.72 to 0.63), but only moderate agreement for log fibroglandular volumes. The kappa statistics for all percent density measures were highest in the comparisons of the SXA and MRI results. The largest error source between MRI and the mammography techniques was found to be differences in measures of total breast volume.

Conclusion

Automated volumetric fibroglandular tissue measures from screening digital mammograms were in substantial agreement with MRI and if associated with breast cancer could be used in clinical practice to enhance risk assessment and prevention.

5.

Purpose

To evaluate repeatability and reproducibility of anterior corneal power measurements obtained with a new corneal topographer OphthaTOP (Hummel AG, Germany) and agreement with measurements by a rotating Scheimpflug camera (Pentacam HR, Oculus, Germany) and an automated keratometer (IOLMaster, Carl Zeiss Meditec, Germany).

Methods

The right eyes of 79 healthy subjects were prospectively measured three times with all three devices. Another examiner performed three additional scans with the OphthaTOP in the same session. Within one week, the first examiner repeated the measurements using the OphthaTOP. The flat simulated keratometry (Kf), steep K (Ks), mean K (Km), J0, and J45 were noted. Repeatability and reproducibility of measurements were assessed by within-subject standard deviation (Sw), repeatability (2.77 Sw), coefficient of variation (CoV), and intraclass correlation coefficient (ICC). Agreement between devices was assessed using 95% limits of agreement (LoA).
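The repeatability measures listed above follow from simple formulas, sketched here under stated assumptions: every subject has the same number of repeated readings, Sw is taken as the square root of the mean within-subject variance, and CoV is Sw divided by the grand mean. The function name and the sample values are hypothetical and do not come from the study.

```python
import math
from statistics import mean, variance


def repeatability_stats(measurements):
    """Summarise repeated readings.

    measurements: list of per-subject lists, one inner list per subject,
    each holding that subject's repeated readings (equal length assumed).
    """
    # Mean of the per-subject sample variances (n - 1 denominator).
    within_var = mean(variance(m) for m in measurements)
    sw = math.sqrt(within_var)
    grand_mean = mean(x for m in measurements for x in m)
    return {
        "sw": sw,                                  # within-subject SD
        "repeatability": 2.77 * sw,                # 2.77 Sw, as in the text
        "cov_percent": 100.0 * sw / grand_mean,    # coefficient of variation
    }


# Hypothetical readings: two subjects, two repeats each.
stats = repeatability_stats([[10, 12], [20, 22]])
```

The factor 2.77 is approximately 1.96 × √2, so 2.77 Sw is the difference that two repeated readings on the same subject are expected to stay within about 95% of the time.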

Results

Intraobserver repeatability and interobserver and intersession reproducibility of all measured parameters showed a 2.77 Sw of 0.29 diopter or less, a CoV of less than 0.24%, and an ICC of more than 0.906. Statistically significant differences (P<0.001) were found between the parameters analyzed by the three devices, except J0 and J45. The mean differences between OphthaTOP and the other two devices were small, and the 95% LoA was narrow for all results.

Conclusions

The OphthaTOP showed excellent intraobserver repeatability and interobserver and intersession reproducibility of corneal power measurements. Good agreement with the other two devices in these parameters was found in healthy eyes.

6.

Background

Incidental CT findings may provide an opportunity for early detection of chronic obstructive pulmonary disease (COPD), which may prove important in CT-based lung cancer screening setting. We aimed to determine the diagnostic performance of human observers to visually evaluate COPD presence on CT images, in comparison to automated evaluation using quantitative CT measures.

Methods

This study was approved by the Dutch Ministry of Health and the institutional review board. All participants provided written informed consent. We studied 266 heavy smokers enrolled in a lung cancer screening trial. All subjects underwent volumetric inspiratory and expiratory chest computed tomography (CT). Pulmonary function testing was used as the reference standard for COPD. We evaluated the diagnostic performance of eight observers and one automated model based on quantitative CT measures.

Results

The prevalence of COPD in the study population was 44% (118/266), of whom 62% (73/118) had mild disease. The diagnostic accuracy was 74.1% in the automated evaluation, and ranged between 58.3% and 74.3% for the visual evaluation of CT images. The positive predictive value was 74.3% in the automated evaluation, and ranged between 52.9% and 74.7% for the visual evaluation. Interobserver variation was substantial, even within the subgroup of experienced observers. Agreement within observers yielded kappa values between 0.28 and 0.68, regardless of the level of expertise. The agreement between the observers and the automated CT model showed kappa values of 0.12–0.35.

Conclusions

Visual evaluation of COPD presence on chest CT images provides at best modest accuracy and is associated with substantial interobserver variation. Automated evaluation of COPD subjects using quantitative CT measures appears superior to visual evaluation by human observers.

7.

Objectives

Left atrial appendage (LAA) dilatation and morphology may influence an individual's risk for intracardiac thrombi and ischemic stroke. LAA size and morphology can be evaluated using cardiac computed tomography (cCT). The present study evaluated the reproducibility of LAA volume and morphology assessments.

Methods

A total of 149 patients (47 females; mean age 60.9±10.6 years) with suspected cardioembolic stroke/transient ischemic attack underwent cCT. Image quality was rated based on four categories. Ten patients were selected from each image quality category (N = 40) for volumetric reproducibility analysis by two individual readers. LAA and left atrium (LA) volume were measured in both two-chamber (2CV) and transversal view (TV) orientation. Intertechnique reproducibility was assessed between 2CV and TV (200 measurement pairs). LAA morphology (A = Cactus, B = ChickenWing, C = WindSock, D = CauliFlower), LAA opening height, number of LAA lobes, trabeculation, and orientation of the LAA tip were analysed in all study subjects by three individual readers (447 interobserver measurement pairs). The reproducibility of volume measurements was assessed by intra-class correlation (ICC) and the reproducibility of LAA morphology assessments by Cohen's kappa.

Results

The intra-observer and interobserver reproducibility of LAA and LA volume measurements was excellent (ICCs>0.9). The LAA (ICC = 0.954) and LA (ICC = 0.945) volume measurements were comparable between 2CV and TV. Morphological classification (κ = 0.24) and assessments of LAA opening height (κ = 0.1), number of LAA lobes (κ = 0.16), trabeculation (κ = 0.15), and orientation of the LAA tip (κ = 0.37) were only slightly to fairly reproducible.

Conclusions

LA and LAA volume measurements on cCT provide excellent reproducibility, whereas visual assessment of LAA morphological features is challenging and results in unsatisfactory agreement between readers.

8.

Background

The role of frozen section (FS) in intraoperative decision making for surgical staging of endometrial cancer is controversial. The objective of this study was to assess the agreement rate between FS and paraffin section (PS), and the potential impact of FS on intraoperative decision making for complete surgical staging in low-risk endometrial cancer.

Methods

This is a retrospective analysis of patients diagnosed with intra-operative FS stage I, grade I or II endometrial cancer from 1995–2004. FS results were compared with final pathology results with regard to tumor grade, depth of myometrial invasion, cervical involvement, lymphovascular invasion, and lymph node involvement. Agreement statistic with kappa was calculated using SPSS statistical software. Categorical variables were tested using chi-square test with p value of ≤0.05 being statistically significant.

Results

Of the 457 patients with endometrial cancer, 146 were evaluated by intra-operative FS and met inclusion criteria. FS results were in disagreement with permanent section in 35% for the grade (kappa 0.58, p = 0.003), 28% for depth of myometrial invasion (kappa 0.61, p<0.0001), 13% for cervical involvement (kappa 0.78, p = 0.002), and 32% for lymphovascular invasion (kappa 0.6, p = 0.01). Permanent pathology upstaged 31.9% and 23.2% of FS stage IA and IB specimens, respectively. Lymph node dissection was done in 56.8%. Lymph node metastasis was identified in 8.4%. Use of intraoperative FS would have resulted in suboptimal surgical treatment in 13% of stage IA and 6.6% of stage IB patients, respectively, by foregoing lymphadenectomy.

Conclusion

A significant number of patients with low-risk endometrial cancer by FS were upstaged and upgraded on final pathology. Before placing absolute reliance on intraoperative FS to undertake complete surgical staging, the inherent limitations of FS in predicting final stage and grade, as highlighted by our data, need to be carefully considered.

9.

Purpose

Chronic hand and wrist pain is a common clinical issue for orthopaedic surgeons and rheumatologists. The purpose of this study was (1) to analyze the interobserver agreement of SPECT/CT, MRI, CT, bone scan and plain radiographs in patients with non-specific pain of the hand and wrist, and (2) to assess the diagnostic accuracy of these imaging methods in this selected patient population.

Materials and Methods

Thirty-two consecutive patients with non-specific pain of the hand or wrist were evaluated retrospectively. All patients had been imaged by plain radiographs, planar early-phase imaging (bone scan), late-phase imaging (SPECT/CT including bone scan and CT), and MRI. Two experienced and two inexperienced readers analyzed the images with a standardized read-out protocol. Reading criteria were lesion detection and localisation, type and etiology of the underlying pathology. Diagnostic accuracy and interobserver agreement were determined for all readers and imaging modalities.

Results

The most accurate modality for experienced readers was SPECT/CT (accuracy 77%), followed by MRI (56%). The best performing modality for inexperienced readers, though not very accurate, was also SPECT/CT (44%), followed by MRI and bone scan (38% each). The interobserver agreement of experienced readers was generally high in SPECT/CT concerning lesion detection (kappa 0.93, MRI 0.72), localisation (kappa 0.91, MRI 0.75) and etiology (kappa 0.85, MRI 0.74), while MRI yielded better results on typification of lesions (kappa 0.75, SPECT/CT 0.69). There was poor agreement between experienced and inexperienced readers in SPECT/CT and MRI.

Conclusions

SPECT/CT proved to be the most helpful imaging modality in patients with non-specific wrist pain. The method was found reliable, providing high interobserver agreement, being outperformed by MRI only concerning the typification of lesions. We believe it is beneficial to integrate SPECT/CT into the diagnostic imaging algorithm of chronic wrist pain.

10.

Background

Clinical examination of trachoma is used to justify intervention in trachoma-endemic regions. Currently, field graders are certified by determining their concordance with experienced graders using the kappa statistic. Unfortunately, trachoma grading can be highly variable and there are cases where even expert graders disagree (borderline/marginal cases). Prior work has shown that inclusion of borderline cases tends to reduce apparent agreement, as measured by kappa. Here, we confirm those results and assess performance of trainees on these borderline cases by calculating their reliability error, a measure derived from the decomposition of the Brier score.

Methods and Findings

We trained 18 field graders using 200 conjunctival photographs from a community-randomized trial in Niger and assessed inter-grader agreement using kappa as well as reliability error. Three experienced graders scored each case for the presence or absence of trachomatous inflammation - follicular (TF) and trachomatous inflammation - intense (TI). A consensus grade for each case was defined as the one given by a majority of experienced graders. We classified cases into a unanimous subset if all 3 experienced graders gave the same grade. For both TF and TI grades, the mean kappa for trainees was higher on the unanimous subset; inclusion of borderline cases reduced apparent agreement by 15.7% for TF and 12.4% for TI. When we assessed the breakdown of the reliability error, we found that our trainees tended to over-call TF grades and under-call TI grades, especially in borderline cases.
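The reliability term mentioned above comes from the classical (Murphy) decomposition of the Brier score into reliability, resolution and uncertainty. The sketch below shows that decomposition for binary outcomes; how the study defined and applied its reliability error in detail is not stated in the abstract, so the function name, the grouping by distinct grade value, and the toy data are assumptions.

```python
from collections import defaultdict


def brier_decomposition(forecasts, outcomes):
    """Murphy decomposition: Brier = reliability - resolution + uncertainty.

    forecasts: a trainee's grades or probabilities in [0, 1].
    outcomes:  consensus grades coded as 0 or 1.
    Cases are grouped by distinct forecast value.
    """
    n = len(forecasts)
    groups = defaultdict(list)
    for f, o in zip(forecasts, outcomes):
        groups[f].append(o)

    base_rate = sum(outcomes) / n
    # Reliability: how far each forecast value is from the observed frequency.
    reliability = sum(len(obs) * (f - sum(obs) / len(obs)) ** 2
                      for f, obs in groups.items()) / n
    # Resolution: how far group frequencies are from the overall base rate.
    resolution = sum(len(obs) * (sum(obs) / len(obs) - base_rate) ** 2
                     for obs in groups.values()) / n
    uncertainty = base_rate * (1 - base_rate)
    brier = sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / n
    return brier, reliability, resolution, uncertainty


# Hypothetical trainee who over-calls: grades 2 of 4 cases positive,
# while the consensus grade is positive in only 1.
brier, rel, res, unc = brier_decomposition([1, 1, 0, 0], [1, 0, 0, 0])
```

A grader who over-calls shows a forecast value above the observed frequency in the positive group, which is what inflates the reliability term.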

Conclusions

The kappa statistic is widely used for certifying trachoma field graders. Exclusion of borderline cases, which even experienced graders disagree on, increases apparent agreement with the kappa statistic. Graders may agree less when exposed to the full spectrum of disease. Reliability error allows for the assessment of these borderline cases and can be used to refine an individual trainee's grading.

11.

Purpose

To compare the reproducibilities of manual and semiautomatic segmentation method for the measurement of normalized cerebral blood volume (nCBV) using dynamic susceptibility contrast-enhanced (DSC) perfusion MR imaging in glioblastomas.

Materials and Methods

Twenty-two patients (11 male, 11 female; 27 tumors) with histologically confirmed glioblastoma (WHO grade IV) were examined with conventional MR imaging and DSC imaging at 3T before surgery or biopsy. Then nCBV (means and standard deviations) in each mass was measured using two DSC MR perfusion analysis methods, manual and semiautomatic segmentation, in which contrast-enhanced (CE)-T1WI and T2WI were used as structural imaging. Intraobserver and interobserver reproducibility were assessed according to each perfusion analysis method or each structural imaging. Intraclass correlation coefficient (ICC), Bland-Altman plot, and coefficient of variation (CV) were used to evaluate reproducibility.
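The numbers behind a Bland-Altman plot are the mean difference (bias) between paired measurements and the 95% limits of agreement around it. A minimal sketch, assuming paired readings of the same cases and approximately normal differences; the function name and sample values are illustrative only.

```python
from statistics import mean, stdev


def limits_of_agreement(first, second):
    """Return (lower limit, bias, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias - 1.96 * sd, bias, bias + 1.96 * sd


# Hypothetical nCBV readings of four tumors by two observers.
lower, bias, upper = limits_of_agreement([1, 2, 3, 4], [1, 1, 3, 3])
```

The plot itself shows each pair's difference against the pair's mean, with horizontal lines at the bias and at the two limits.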

Results

Intraobserver reproducibilities on CE-T1WI and T2WI were ICC of 0.74–0.89 and CV of 20.39–36.83% in manual segmentation method, and ICC of 0.95–0.99 and CV of 8.53–16.19% in semiautomatic segmentation method, respectively. Interobserver reproducibilities on CE-T1WI and T2WI were ICC of 0.86–0.94 and CV of 19.67–35.15% in manual segmentation method, and ICC of 0.74–1.0 and CV of 5.48–49.38% in semiautomatic segmentation method, respectively. Bland-Altman plots showed a good correlation with ICC or CV in each method. The semiautomatic segmentation method showed higher intraobserver and interobserver reproducibilities at CE-T1WI-based study than other methods.

Conclusion

The best reproducibility was found using the semiautomatic segmentation method based on CE-T1WI for structural imaging in the measurement of the nCBV of glioblastomas.

12.

Introduction

To determine the validity and reliability of patients' self-performed joint counts compared to joint counts by professional assessors in rheumatoid arthritis (RA) patients in different disease activity states.

Methods

In patients with established RA we determined the inter-rater reliability of joint counts performed by an independent evaluator and the patient using intraclass correlation (ICC), and agreement on activity in individual joints by kappa statistics. We also performed longitudinal analyses to assess consistency of assessments over time. Finally, we investigated the concordance of joint counts of different assessors in patients with different levels of disease activity.

Results

The reliability of patient self-performed joint counts was high when compared to independent objective assessment (ICC; 95% confidence interval (CI)) for the assessment of swelling (0.32; 0.15 to 0.46) and tenderness (0.75; 0.66 to 0.81), with higher agreement for larger joints (kappa: 0.57 and 0.45, respectively) compared to smaller joints (metacarpophalangeal joints (MCPs): 0.31 and 0.45; and proximal interphalangeal joints (PIPs): 0.22 and 0.47, for swelling and tenderness, respectively). Patients in remission according to the Simplified Disease Activity Index (SDAI ≤ 3.3) showed better concordance of the joint counts (swollen joint count (SJC) ties 25/37, tender joint count (TJC) ties 26/37) compared to moderate/high disease activity states (SDAI > 11; MDA/HDA: SJC ties 9/72, TJC ties 21/72). Positive and negative predictive values regarding the presence of SDAI remission were reasonably good (0.86 and 0.95, respectively). A separate training session for patients did not improve the reliability of joint assessment. The results were consistent in the longitudinal analyses.

Conclusions

Self-performed joint counts are particularly useful for monitoring in patients having attained remission, as these patients seem able to detect state of remission.

13.

Background

Magnetic Resonance Imaging (MRI) is considered the mainstay imaging investigation in patients suspected of lumbar disc herniations. Both imaging and clinical findings determine the final decision of surgery. The objective of this study was to assess MRI observer variation in patients with sciatica who are potential candidates for lumbar disc surgery.

Methods

Patients for this study were potential candidates (n = 395) for lumbar disc surgery who underwent MRI to assess eligibility for a randomized trial. Two neuroradiologists and one neurosurgeon independently evaluated all MRIs. A four point scale was used for both probability of disc herniation and root compression, ranging from definitely present to definitely absent. Multiple characteristics of the degenerated disc herniation were scored. For inter-agreement analysis absolute agreements and kappa coefficients were used. Kappa coefficients were categorized as poor (<0.00), slight (0.00–0.20), fair (0.21–0.40), moderate (0.41–0.60), substantial (0.61–0.80) and excellent (0.81–1.00) agreement.
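The kappa bands defined above translate directly into a lookup. One assumption is needed: the abstract leaves values between two bands (for example 0.205) unassigned, so this sketch treats each upper bound as inclusive. The function name is hypothetical.

```python
def kappa_category(kappa):
    """Map a kappa coefficient to the agreement label used in this abstract."""
    if kappa < 0.0:
        return "poor"
    # Upper bounds are treated as inclusive (an assumption; see lead-in).
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial"), (1.00, "excellent")]:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1.0")


labels = [kappa_category(k) for k in (0.57, 0.77, 0.86)]
```

Applied to the ranges reported in the Results, 0.57–0.77 spans "moderate" to "substantial" and 0.81–0.86 falls entirely in "excellent", matching the wording used there.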

Results

Excellent agreement was found on the affected disc level (kappa range 0.81–0.86) and the nerve root that most likely caused the sciatic symptoms (kappa range 0.86–0.89). Interobserver agreement was moderate to substantial for the probability of disc herniation (kappa range 0.57–0.77) and the probability of nerve root compression (kappa range 0.42–0.69). Absolute pairwise agreement among the readers ranged from 90–94% regarding the question whether the probability of disc herniation on MRI was above or below 50%. Generally, moderate agreement was observed regarding the characteristics of the symptomatic disc level and of the herniated disc.

Conclusion

The observer variation of MRI interpretation in potential candidates for lumbar disc surgery is satisfactory regarding characteristics most important in decision for surgery. However, there is considerable variation between observers in specific characteristics of the symptomatic disc level and herniated disc.

14.

Background

Severe fetal acidemia during labour with arterial pH below 7.00 is associated with increased risk of hypoxic-ischemic brain injury. Electronic fetal heart rate (FHR) monitoring, the mainstay of intrapartum surveillance, has poor specificity for detecting fetal acidemia. We studied brain electrical activity measured with electrocorticogram (ECOG) in the near term ovine fetus subjected to repetitive umbilical cord occlusions (UCO) inducing FHR decelerations, as might be seen in human labour, to delineate the time-course for ECOG changes with worsening acidemia and thereby assess the potential clinical utility of fetal ECOG.

Methodology/Principal Findings

Ten chronically catheterized fetal sheep were studied through a series of mild, moderate and severe UCO until the arterial pH was below 7.00. At a pH of 7.24±0.04, 52±13 min prior to the pH dropping <7.00, spectral edge frequency (SEF) increased to 23±2 Hz from 3±1 Hz during each FHR deceleration (p<0.001) and was correlated to decreases in FHR and in fetal arterial blood pressure during each FHR deceleration (p<0.001).

Conclusions/Significance

The UCO-related changes in ECOG occurred in advance of the pH decreasing below 7.00. These ECOG changes may be a protective mechanism suppressing non-essential energy needs when oxygen supply to the fetal brain is decreased acutely. By detecting such “adaptive brain shutdown,” the need for delivery in high risk pregnant patients may be more accurately predicted than with FHR monitoring alone. Therefore, monitoring fetal electroencephalogram (EEG, the human equivalent of ECOG) during human labour may be a useful adjunct to FHR monitoring.

15.

Background

Case management guidelines use a limited set of clinical features to guide assessment and treatment for common childhood diseases in poor countries. Using video records of clinical signs we assessed agreement among experts and assessed whether Kenyan health workers could identify signs defined by expert consensus.

Methodology

104 videos representing 11 clinical sign categories were presented to experts using a web questionnaire. Proportionate agreement and agreement beyond chance were calculated using kappa and the AC1 statistic. 31 videos were selected and presented to local health workers, 20 for which experts had demonstrated clear agreement and 11 for which experts could not demonstrate agreement.
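Kappa and AC1 differ only in how chance agreement is estimated, which is why they can diverge when one category dominates. The sketch below uses the two-rater, binary-sign forms (Cohen's kappa and Gwet's AC1) as a simplified stand-in; the study itself used multi-observer versions, and the function name and ratings here are hypothetical.

```python
def agreement_stats(rater_a, rater_b):
    """Observed agreement, Cohen's kappa and Gwet's AC1 for binary ratings."""
    n = len(rater_a)
    observed = sum(1 for x, y in zip(rater_a, rater_b) if x == y) / n
    pos_a = sum(rater_a) / n
    pos_b = sum(rater_b) / n

    # Kappa: chance agreement from the product of each rater's marginals.
    chance_kappa = pos_a * pos_b + (1 - pos_a) * (1 - pos_b)
    # AC1: chance agreement from the average prevalence of "sign present".
    prevalence = (pos_a + pos_b) / 2
    chance_ac1 = 2 * prevalence * (1 - prevalence)

    return {
        "observed": observed,
        "kappa": (observed - chance_kappa) / (1 - chance_kappa),
        "ac1": (observed - chance_ac1) / (1 - chance_ac1),
    }


# Hypothetical: 10 videos, sign judged present by both raters in 8,
# and by exactly one rater in each of the remaining 2.
a = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
b = [1, 1, 1, 1, 1, 1, 1, 1, 0, 1]
result = agreement_stats(a, b)
```

With 80% observed agreement but a sign that is almost always called present, kappa in this toy case drops slightly below zero while AC1 stays around 0.76.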

Principal Findings

Experts reached a very high level of chance-adjusted agreement for some videos while for a few videos no agreement beyond chance was found. Where experts agreed, Kenyan hospital staff of all cadres recognised signs with high mean sensitivity and specificity (sensitivity: 0.897–0.975, specificity: 0.813–0.894); years of experience, gender and hospital had no influence on mean sensitivity or specificity. Local health workers did not agree on videos where experts had low or no agreement. Results of different agreement statistics for multiple observers, the AC1 and Fleiss' kappa, differ across the range of proportionate agreement.

Conclusion

Videos provide a useful means to test agreement amongst geographically diverse groups of health workers. Kenyan health workers are in agreement with experts where clinical signs are clear-cut supporting the potential value of assessment and management guidelines. However, clinical signs are not always clear-cut. Video recordings offer one means to help standardise interpretation of clinical signs.

16.

Objective

Although surgical-site infection (SSI) rates are advocated as a major evaluation criterion, the reproducibility of SSI diagnosis is unknown. We assessed agreement in diagnosing SSI among specialists involved in SSI surveillance in Europe.

Methods

Twelve case-vignettes based on suspected SSI were submitted to 100 infection-control physicians (ICPs) and 86 surgeons in 10 European countries. Each participant scored eight randomly-assigned case-vignettes on a secure online relational database. The intra-class correlation coefficient (ICC) was used to assess agreement for SSI diagnosis on a 7-point Likert scale and the kappa coefficient to assess agreement for SSI depth on a three-point scale.

Results

Intra-specialty agreement for SSI diagnosis ranged across countries and specialties from 0.00 (95%CI, 0.00–0.35) to 0.65 (0.45–0.82). Inter-specialty agreement varied from 0.04 (0.00–0.62) in to 0.55 (0.37–0.74) in Germany. For all countries pooled, intra-specialty agreement was poor for surgeons (0.24, 0.14–0.42) and good for ICPs (0.41, 0.28–0.61). Reading SSI definitions improved agreement among ICPs (0.57) but not surgeons (0.09). Intra-specialty agreement for SSI depth ranged across countries and specialties from 0.05 (0.00–0.10) to 0.50 (0.45–0.55) and was not improved by reading SSI definition.

Conclusion

Among ICPs and surgeons evaluating case-vignettes of suspected SSI, considerable disagreement occurred regarding the diagnosis, with variations across specialties and countries.

17.

Background

Clear definitions of outcomes following trichiasis surgery are critical for planning program evaluations and for identifying ways to improve trichiasis surgery. Eyelid contour abnormality is an important adverse outcome of surgery; however, no standard method has been described to categorize eyelid contour abnormalities.

Methodology/Principal Findings

A classification system for eyelid contour abnormalities following surgery for trachomatous trichiasis was developed. To determine whether the grading was reproducible using the classification system, six-week postoperative photographs were reviewed by two senior graders to characterize severity of contour abnormalities. Sample photographs defining each contour abnormality category were compiled and used to train four new graders. All six graders independently graded a Standardization Set of 75 eyelids, which included a roughly equal distribution across the severity scale, and weighted kappa scores were calculated. Two hundred forty six-week postoperative photographs from an ongoing clinical trial were randomly selected for evaluating agreement across graders. Two months after initial grading, one grader regraded a subset of the 240 photographs to measure longer-term intra-observer agreement. The weighted kappa for agreement between the two senior graders was 0.80 (95% CI: 0.71–0.89). Among the Standardization Set, agreement between the senior graders and the 4 new graders showed weighted kappa scores ranging from 0.60–0.80. Among 240 eyes comprising the clinical trial dataset, agreement ranged from weighted kappa 0.70–0.71. Longer-term intra-observer agreement was weighted kappa 0.86 (95% CI: 0.80–0.92).

Conclusions/Significance

The standard eyelid contour grading system we developed reproducibly delineates differing levels of contour abnormality. This grading system could be useful both for helping to evaluate trichiasis surgery outcomes in clinical trials and for evaluating trichiasis surgery programs.

18.

Background

Poor adherence to isoniazid (INH) preventive therapy (IPT) is an impediment to effective control of latent tuberculosis (TB) infection. TB patients who smoke are at higher risk of latent TB infection, active disease, and TB mortality, and may have lower adherence to their TB medications. The objective of our study was to validate IsoScreen and SmokeScreen (GFC Diagnostics, UK), two point-of-care tests for monitoring INH intake and determining smoking status. The tests could be used together in the same individual to help identify patients with a high-risk profile and provide a tailored treatment plan that includes medication management, adherence interventions, and smoking cessation programs.

Methodology/Principal Findings

Two hundred adult outpatients attending the TB clinic and/or the smoking cessation clinic were recruited at the Montreal Chest Institute. Sensitivity and specificity were measured for each test against the corresponding composite reference standard. Test reliability was measured using the kappa statistic for intra-rater and inter-rater agreement. Univariate and multivariate logistic regression models were used to explore covariates that might be related to false-positive and false-negative test results. IsoScreen had a sensitivity of 93.2% (95% confidence interval [CI] 80.3, 98.2) and a specificity of 98.7% (94.8, 99.8). IsoScreen had intra-rater agreement (kappa) of 0.75 (0.48, 0.94) and inter-rater agreement of 0.61 (0.27, 0.90). SmokeScreen had a sensitivity of 69.2% (56.4, 79.8), a specificity of 81.6% (73.0, 88.0), intra-rater agreement of 0.77 (0.56, 0.94), and inter-rater agreement of 0.66 (0.42, 0.88). False-positive SmokeScreen tests were strongly associated with INH treatment.
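As an illustration of the validity measures reported above, the sketch below computes sensitivity and specificity from a 2x2 table and attaches a 95% confidence interval to a proportion. The Wilson score interval is an assumption chosen for illustration; the abstract does not say which interval method the authors used, and the underlying cell counts are not reported, so no attempt is made to reproduce the published figures.

```python
import math


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from the counts of a 2x2 table."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one reference-positive and one reference-negative case")
    sensitivity = tp / (tp + fn)  # reference-positive cases the test detects
    specificity = tn / (tn + fp)  # reference-negative cases the test clears
    return sensitivity, specificity


def wilson_interval(successes, total, z=1.959964):
    """Wilson score interval for a proportion (z=1.959964 gives about 95%)."""
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need 0 <= successes <= total and total > 0")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half
```

For example, `wilson_interval(tp, tp + fn)` would give an interval for sensitivity, and `wilson_interval(tn, tn + fp)` one for specificity.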

Conclusions

IsoScreen had high validity and reliability, whereas SmokeScreen had modest validity and reliability. SmokeScreen tests did not perform well in a population receiving INH due to the association between INH treatment and false-positive SmokeScreen test results. Development of the next generation SmokeScreen assay should account for this potential interference.

19.

Objective

Interpretation of the EEG background pattern in routine recordings is an important part of clinical reviews. We evaluated the feasibility of an automated analysis system to assist reviewers with evaluation of the general properties in the EEG background pattern.

Methods

Quantitative EEG methods were used to describe the following five background properties: posterior dominant rhythm frequency and reactivity, anterior-posterior gradients, presence of diffuse slow-wave activity, and asymmetry. Software running the quantitative methods was given to ten experienced electroencephalographers together with 45 routine EEG recordings and computer-generated reports. Participants were asked to review the EEGs by visual analysis first, and afterwards to compare their findings with the generated reports and correct mistakes made by the system. Corrected reports were returned for comparison.

Results

Using a gold standard derived from the consensus of reviewers, inter-rater agreement was calculated for all reviewers and for automated interpretation. Automated interpretation and most participants showed high (kappa > 0.6) agreement with the gold standard. In some cases, automated analysis showed higher agreement with the gold standard than participants did. When asked in a questionnaire after the study, all participants considered computer-assisted interpretation to be useful for everyday use in routine reviews.

Conclusions

Automated interpretation methods proved to be accurate and were considered to be useful by all participants.

Significance

Computer-assisted interpretation of the EEG background pattern can bring consistency to reviewing and improve efficiency and inter-rater agreement.

20.

Background

Cerebral microbleeds, visible on gradient-recalled echo (GRE) T2* MRI, have generated increasing interest as an imaging marker of small vessel diseases, with relevance for intracerebral bleeding risk or brain dysfunction.

Methodology/Principal Findings

Manual rating methods have limited reliability and are time-consuming. We developed a new method for microbleed detection using automated segmentation (MIDAS) and compared it with a validated visual rating system. In thirty consecutive stroke service patients, standard GRE T2* images were acquired and manually rated for microbleeds by a trained observer. After spatially normalizing each patient's GRE T2* images into a standard stereotaxic space, the automated microbleed detection algorithm (MIDAS) identified cerebral microbleeds by explicitly incorporating an "extra" tissue class for abnormal voxels within a unified segmentation-normalization model. The agreement between manual and automated methods was assessed using the intraclass correlation coefficient (ICC) and the kappa statistic. We found that MIDAS had generally moderate to good agreement with the manual reference method for the presence of lobar microbleeds (kappa = 0.43, improved to 0.65 after manual exclusion of obvious artefacts). Agreement for the number of microbleeds was very good for lobar regions (ICC = 0.71, improved to ICC = 0.87). MIDAS successfully detected all patients with multiple (≥2) lobar microbleeds.
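To make the count-agreement measure concrete, here is a minimal sketch of an intraclass correlation coefficient for per-patient microbleed counts from two methods. It implements the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1); that choice is an assumption, since the abstract does not specify which ICC form was used, and the function name and input layout are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    ratings: list of rows, one per subject, each holding one value per method
    (for example [manual_count, automated_count]).
    """
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least 2 subjects and 2 methods, with equal row lengths")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between methods
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    denom = msr + (k - 1) * mse + k * (msc - mse) / n
    if denom == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (msr - mse) / denom
```

Because this form measures absolute agreement, a constant offset between the two methods lowers the coefficient even when the rankings of patients are identical.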

Conclusions/Significance

MIDAS can identify microbleeds on standard MR datasets, and with an additional rapid editing step shows good agreement with a validated visual rating system. MIDAS may be useful in screening for multiple lobar microbleeds.
