Similar articles
 Found 20 similar articles (search time: 46 ms)
1.
One hundred and ten cervical smears were circulated to five specialist consultant cytopathologists and five consultant histopathologists. Of these smears, 100 were randomized and re‐circulated. The cytopathologists reported endocervical cells and wart virus infection more frequently than the histopathologists, although neither group showed good inter‐observer agreement for either assessment. Apart from smear adequacy and the presence of endocervical cells, both groups showed good intra‐observer agreement in all the parameters measured. This suggests that, overall, individuals were applying their own personal criteria with consistency over time, although a previous study had shown considerable lack of inter‐observer agreement among the histopathologists on the grade of dyskaryosis and the management recommendation. The results indicate that specialist cytopathologists bring a different viewpoint to the reporting of cervical smears from that of histopathologists. They also show a lack of standardization in the reporting of smears despite the guidelines issued by the British Society for Clinical Cytology.

2.
We assess the importance of variation in observer effort between and within bird atlas projects and demonstrate the use of relatively simple conditional autoregressive (CAR) models for analyzing grid‐based atlas data with varying effort. The study areas were Pennsylvania and West Virginia, United States of America. We used varying proportions of randomly selected training data to assess whether variations in observer effort can be accounted for using CAR models and whether such models would still be useful for atlases with incomplete data. We then evaluated whether the application of these models influenced our assessment of distribution change between two atlas projects separated by twenty years (Pennsylvania), and tested our modeling methodology on a state bird atlas with incomplete coverage (West Virginia). Conditional autoregressive models that included observer effort and landscape covariates were able to make robust predictions of species distributions in cases of sparse data coverage. Further, we found that CAR models without landscape covariates performed favorably. These models also account for variation in observer effort between atlas projects and can have a profound effect on the overall assessment of distribution change. Accounting for variation in observer effort in atlas projects is critically important. CAR models provide a useful modeling framework for accounting for variation in observer effort in bird atlas data because they are relatively simple to apply and quick to run.

3.
A conceptual model describing the response of two Australian floodplain eucalypts, river red gum (Eucalyptus camaldulensis) and black box (Eucalyptus largiflorens), to changes in water availability was developed based on field observations. This model was incorporated into a percentage-based visual method estimating two tree crown parameters, crown extent and density. Extent is the amount of foliage at the periphery of the assessable crown; density is the density of assessable crown foliage. Polychoric correlation was used to determine the level of agreement between two experienced observers assessing river red gum and black box trees using a simple percentage scale and a percentage scale supported by the conceptual model. Trees were evaluated using the model by determining their position on a trajectory of water-stress-related decline and response. In both cases observer estimates of crown extent and density were significantly correlated. With the exception of red gum crown density, the correlation coefficients were higher for the model-supported scale. Using a conceptual model of tree response to water availability improved observer agreement. Supporting subjective assessment systems with a conceptual model is recommended to improve observer agreement in cases where a distinct model of the dominant stressor can be defined.

4.
5.
At present, various scar assessment scales are available, but not one has been shown to be reliable, consistent, feasible, and valid at the same time. Furthermore, the existing scar assessment scales appear to attach little weight to the opinion of the patient. The newly developed Patient and Observer Scar Assessment Scale consists of two numeric scales: the Patient Scar Assessment Scale (patient scale) and the Observer Scar Assessment Scale (observer scale). The patient and observer scales have to be completed by the patient and the observer, respectively. The patient scale's consistency and the observer scale's consistency, reliability, and feasibility were tested. For the Vancouver Scar Scale, which is the most frequently used scar assessment scale at present, the same statistical measurements were examined and the results of the observer scale and the Vancouver scale were compared. The concurrent validity of the observer scale was tested with a correlation to the Vancouver scale. Furthermore, the authors examined which specific characteristics significantly influence the general opinion of the patient and the observers on the scar areas. Four independent observers each used the observer scale and the Vancouver scale to assess 49 burn scar areas of 3 × 3 cm belonging to 20 different patients. Subsequently, the patients completed the patient scale for their scar areas. The (internal) consistency of both the patient and the observer scales was acceptable (Cronbach's alpha, 0.76 and 0.69, respectively), whereas the consistency of the Vancouver scale appeared not to be acceptable (alpha, 0.49). The reliability of the observer scale completed by a single observer was acceptable (r = 0.73). The reliability of the Vancouver scale completed by a single observer was lower (r = 0.69). The observer scale showed better agreement than the Vancouver scale because the coefficient of variation was lower (18 percent and 22 percent, respectively).
The concurrent validity of the observer scale in relation to the Vancouver scale is high (r = 0.89, p < 0.001). Linear regression of the general opinions on scars of the observer and the patient showed that the observer's opinion is influenced by vascularization, thickness, pigmentation, and relief, whereas the patient's opinion is mainly influenced by itching and the thickness of the scar. Such an impact of itching and thickness of the scar on the patient's opinion is an important and novel finding. The Patient and Observer Scar Assessment Scale offers a suitable, reliable, and complete scar evaluation tool.
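
As an illustration of two of the statistics quoted in this abstract, the following minimal Python sketch computes Cronbach's alpha and the coefficient of variation from their standard textbook definitions. The function names and the layout of the input (one row per scar area, one column per scale item) are hypothetical choices for this example; the abstract does not describe how the authors carried out their calculations.

```python
from statistics import mean, stdev, variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores.

    scores: list of rows, one per rated subject; each row holds the
    k item scores for that subject (hypothetical layout).
    """
    k = len(scores[0])
    items = list(zip(*scores))  # transpose: one tuple per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)
```

Both functions use sample (n − 1) variances. A lower coefficient of variation across observers scoring the same scar corresponds to the "better agreement" reading used in the abstract.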

6.
Purpose

The purpose of this work was to evaluate the contrast-detail performance of full field digital mammography (FFDM) systems using ideal (Hotelling) observer Signal-to-Noise Ratio (SNR) methodology and ascertain whether it can be considered an alternative to the conventional, automated analysis of CDMAM phantom images.

Methods

Five FFDM units currently used in the national breast screening programme were evaluated, which differed with respect to age, detector, Automatic Exposure Control (AEC) and target/filter combination. Contrast-detail performance was analysed using CDMAM and ideal observer SNR methodology. The ideal observer SNR was calculated for input signal originating from gold discs of varying thicknesses and diameters, and then used to estimate the threshold gold thickness for each diameter as per CDMAM analysis. The variability of both methods and the dependence of CDMAM analysis on phantom manufacturing discrepancies were also investigated.

Results

Results from both CDMAM and ideal observer methodologies were informative differentiators of FFDM systems' contrast-detail performance, displaying comparable patterns with respect to the FFDM systems' type and age. CDMAM results suggested higher threshold gold thickness values compared with the ideal observer methodology, especially for small-diameter details, which can be attributed to the behaviour of the CDMAM phantom used in this study. In addition, ideal observer methodology results showed lower variability than CDMAM results.

Conclusion

The ideal observer SNR methodology can provide a useful metric of the FFDM systems' contrast-detail characteristics and could be considered a surrogate for conventional, automated analysis of CDMAM images.

7.
Photo analysis offers a simple, noninvasive approach to characterizing and quantifying skin lesions in cetaceans; however, this process involves methodological considerations that have often gone unaddressed or have varied in approach among investigators. Subjectivity associated with classifying skin lesion types of unknown etiology and quantifying measures of skin lesion prevalence and extent from photo data raises questions about observer bias and agreement (i.e., interrater reliability), which are often ignored. The purpose of the present study was to improve upon data quality control and assessment practices when studying skin lesions using only photo data. Specifically, we tested interrater reliability of a skin lesion classification system, compared methods of quantifying skin lesion extent, and determined the validity of the dorsal fin as a proxy for skin lesions on the entire body. Acceptable levels of interrater reliability were achieved for only 7 of 17 defined lesion types, but reliability was high for the two tested measures of lesion extent. Skin lesion extent measured from the dorsal fin alone was not a decent proxy for the whole visible surface; disparities between measures were as high as 43%. We discuss the potential pitfalls discovered and provide recommendations for others attempting similar approaches.

8.
Summary: For decades the floodplain forests of the River Murray have endured the effects of prolonged water stress. This has resulted in significant crown dieback and loss of condition. The Living Murray (TLM) initiative aims to restore the ecological health of six Icon Sites along the River. The two eucalypts River Red Gum (Eucalyptus camaldulensis) and Black Box (Eucalyptus largiflorens) that dominate the forests at five of the six Icon Sites are undergoing widespread decline. To enable effective management and restoration of these forests, we developed a standardised tree condition assessment method. Named the TLM tree condition assessment method, it utilises visual assessment of a range of tree crown variables (extent and density of the foliage in the crown, epicormic growth, new tip growth, reproductive activity, leaf die‐off, mistletoe infestation) and measurements of bark condition, diameter at breast height and dominance class. This article describes the TLM tree condition assessment method and assesses it for consistency between multiple observer teams after limited training. The level of observer agreement between six teams, each comprising two observers, was assessed for seven of the ten variables. Intra‐class correlation was used to compare scores of 30 River Red Gum trees assessed on Gunbower Island on the River Murray. The level of agreement for all variables was statistically significant, with six of seven variables having correlation coefficients over R = 0.5. The TLM tree condition assessment method was found to provide accurate estimates of a range of tree variables that can be used to determine tree condition. The TLM tree condition assessment method provides a valuable monitoring tool that can be used to assess management interventions, such as management flooding and silvicultural thinning.
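
The abstract does not state which form of intra-class correlation was used. As a hedged sketch only, the one-way random-effects form, often written ICC(1,1), can be computed as below. The function name and the layout of `ratings` (one row per tree, one column per observer team) are assumptions made for illustration.

```python
def icc_oneway(ratings):
    """One-way random-effects intra-class correlation, ICC(1,1).

    ratings: list of rows, one per subject (e.g. a tree); each row
    holds one score per rater (e.g. an observer team). Assumes at
    least two subjects, two raters, and some variation in the scores.
    """
    n = len(ratings)      # number of subjects
    k = len(ratings[0])   # number of raters per subject
    row_means = [sum(row) / k for row in ratings]
    grand_mean = sum(row_means) / n

    # Between-subject and within-subject mean squares from one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
```

In the setting described above, `ratings` would hold 30 rows of six scores for each assessed variable. Other ICC forms (two-way, average-rater) give different values for the same data, so the choice of form should be reported alongside the coefficient.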

9.

Background

Assessing the quality of care provided by individual health practitioners is critical to identifying possible risks to the health of the public. However, existing assessment methods can be inaccurate, expensive, or infeasible in many developing country settings, particularly in rural areas and especially for children. Following an assessment of the strengths and weaknesses of the existing methods for provider assessment, we developed a synthesis method combining components of direct observation, clinical vignettes, and medical mannequins which we have termed “Observed Simulated Patient” or OSP. An OSP assessment involves a trained actor playing the role of a ‘mother’, a life-size doll representing a 5-year old boy, and a trained observer. The provider being assessed was informed in advance of the role-playing, and told to conduct the diagnosis and treatment as he normally would while verbally describing the examinations.

Methodology/Principal Findings

We tested the validity of OSP by conducting parallel scoring of medical providers in Myanmar, assessing the quality of their diagnosis and treatment of pediatric malaria, first by direct observation of true patients and second by OSP. Data were collected from 20 private independent medical practitioners in Mon and Kayin States, Myanmar between December 26, 2010 and January 12, 2011. All areas of assessment showed agreement between OSP and direct observation above 90% except for history taking related to past experience with malaria medicines. In this area, providers did not ask questions of the OSP to the same degree that they questioned real patients (agreement 82.8%).

Conclusions/Significance

The OSP methodology may provide a valuable option for quality assessment of providers in places, or for health conditions, where other assessment tools are unworkable.

10.
As the recognition of the importance of biological diversity in biological conservation grows, an ongoing challenge is to develop metrics that can be used for effective conservation and management. The ecological integrity assessment has been proposed as such a metric. It is held by some to measure species composition, diversity, and habitat quality, as well as ecosystem structure, composition, and function. The methodology relies on proxy variables that include data on landscape characteristics such as patch size, abiotic factors such as hydrology, and some features of vegetation structure and composition. We suggest that the measure is flawed on four levels. First, its putative representation of general ecological form and function, and its lack of specific detail about how it actually represents those attributes, leaves the metric without the focus needed to be useful for measuring ecological features on the ground and testing associated hypotheses and predictions. Second, the proxy variables used to represent biological diversity, such as habitat (vegetation) metrics and vascular plant species diversity, are not empirically correlated with diversity of a range of taxa or of other components of the biota. Third, like other ecological indices that integrate many distinct features, the ecological integrity index is subject to the loss of information in its condensation of multi-dimensional variability into a one-dimensional index, and it may be subject to systematic bias from the conversion of raw data into categorical scores. Fourth, the sampling protocols are at risk of sampling bias, observer bias, and measurement error, any of which can confound the estimation of conservation value. In terms of biological diversity, the methodology produces an unreliable estimate of the number of vascular plant species and their relative percentages of occurrence, and it lacks protocols for taxa other than plants.
For these reasons we believe that ecological integrity assessment is currently of limited value as a measure of site-specific biological diversity and its change over time. A considerable amount of investigation is needed in order to have confidence in the results of an ecological integrity assessment, especially if it is to be used for regulatory purposes. We suggest further refinements and discuss alternative measures of biological diversity that provide reliable metrics for assessing change. A thoughtful choice among measures can help to identify the most appropriate assessment for conservation decisions.

11.
12.
The concordance correlation coefficient (CCC) and the probability of agreement (PA) are two frequently used measures for evaluating the degree of agreement between measurements generated by two different methods. In this paper, we consider the CCC and the PA using the bivariate normal distribution for modeling the observations obtained by two measurement methods. The main aim of this paper is to develop diagnostic tools for the detection of those observations that are influential on the maximum likelihood estimators of the CCC and the PA using the local influence methodology but not based on the likelihood displacement. Thus, we derive first‐ and second‐order measures considering the case‐weight perturbation scheme. The proposed methodology is illustrated through a Monte Carlo simulation study and using a dataset from a clinical study on transient sleep disorder. Empirical results suggest that under certain circumstances first‐order local influence measures may be more powerful than second‐order measures for the detection of influential observations.
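
For readers unfamiliar with the CCC, the sketch below computes the usual plug-in estimate, 2·s_xy / (s_x² + s_y² + (x̄ − ȳ)²), with 1/n moments, which matches the maximum likelihood estimates under a bivariate normal model. It illustrates the coefficient only, not the local influence diagnostics developed in the paper; the function name and inputs are hypothetical.

```python
def concordance_correlation(x, y):
    """Plug-in concordance correlation coefficient for paired data.

    x, y: equal-length sequences of measurements of the same subjects
    by two methods. Uses 1/n (maximum likelihood) moments.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # The squared mean difference penalizes a systematic shift between
    # the two methods, which ordinary correlation would ignore.
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)
```

Two methods that differ by a constant offset have a Pearson correlation of 1 but a CCC below 1, which is why the CCC is used as an agreement measure rather than an association measure.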

13.
Noninvasive assessment of implant capsules   Total citations: 2; self-citations: 0; citations by others: 2
The assessment of implant capsular contracture has been imprecise and vulnerable to observer bias. Attempts to measure capsules with instruments that measure implant deformability are influenced by surrounding breast tissue, subcutaneous fat, and skin. Xeromammography, B-mode ultrasound, and CT were employed in an effort to provide a noninvasive and accurate method of capsule assessment. Through two study phases, implants were placed bilaterally in a total of 21 rabbits. At 4 months, animals underwent radiologic assessment and were then sacrificed for direct implant capsule measurements. Mammographic measurements, more than ultrasound-derived measurements, strongly correlated with laboratory measures of capsular dimensions and deformability. Cross-table lateral mammographic views were more informative than traditional views, providing measures of diameter and height that both strongly correlated with laboratory measurements. CT is theoretically the most accurate method to assess contracture, but it is impractical because of expense and time requirements. The results indicate that radiologic assessment, in particular by xeromammography, of implant capsules is accurate, practical, and noninvasive. Mammography strongly correlates with laboratory measures of implant capsular contracture and therefore could be used in the clinical setting to assess capsular contracture.

14.
The determination and the assessment of Best Available Techniques (BAT) is one of the key issues in the realisation of the IPPC-Directive. While research has already focused on environmental benefits and technical practicability of techniques within LCA, little work has been carried out assessing economic feasibility. A methodology for the economic assessment of BAT in the framework of the IPPC-Directive on a plant level has to comprise all costs that accrue from measures to prevent, reduce, utilise or remove emissions into water, air and soil caused by industrial production processes. The applied cost concept provides a systematic accounting and allocation of decision-relevant costs, and possibly revenues, that are pertinent to the economic assessment of BAT. The application of the methodology to a case study from the steel industry shows the practical use of the approach.

15.
The goal of this work is to monitor the species in a class of predator–prey systems, which is important from an ecological point of view for analyzing population dynamics. This is done via a nonlinear observer design whose structure contains a high-order polynomial form of the estimation error. A theoretical framework is provided to show the convergence characteristics of the proposed observer, from which it can be concluded that the performance of the observer improves as the order of the polynomial increases. The proposed methodology is applied to a class of Lotka–Volterra systems with two and three species. Finally, numerical simulations present the performance of the proposed observer.

16.
This study used a mixed methods methodology to investigate the reliability and validity of the Ounce Scale, an authentic, observational assessment of infants' and toddlers' development from birth through 42 months of age. Quantitative cross-sectional data were collected from 287 children and 124 teachers in seven urban Early Head Start programs; qualitative data were derived from interviews with 21 teachers and seven supervisors. Data were collected across eight age groups. Results showed moderate reliability of the Ounce Scale and provided evidence of agreement with criterion measures for concurrent validity. Receiver operating characteristic curve (ROC) analyses demonstrated very good levels of accuracy in predicting which children were at-risk or not at-risk. Hierarchical regression analyses indicated that, after controlling for child and family variables, the Ounce Scale contributed significantly to explaining the variance in children's performance on the criterion measures. Analysis of qualitative interview data elaborates on these findings in terms of the strength-based philosophy of the caregivers, the binary structure of the scale, the cultural context in which the scale was used, and the need for additional professional development. Discussion also centers on the relationship between norm-referenced and performance-based assessments in early childhood.

17.

Background

Peer review of grant applications has been criticized as lacking reliability. Studies showing poor agreement among reviewers supported this possibility but usually focused on reviewers’ scores and failed to investigate reasons for disagreement. Here, our goal was to determine how reviewers rate applications, by investigating reviewer practices and grant assessment criteria.

Methods and Findings

We first collected and analyzed a convenience sample of French and international calls for proposals and assessment guidelines, from which we created an overall typology of assessment criteria comprising nine domains: relevance to the call for proposals, usefulness, originality, innovativeness, methodology, feasibility, funding, ethical aspects, and writing of the grant application. We then performed a qualitative study of reviewer practices, particularly regarding the use of assessment criteria, among reviewers of the French Academic Hospital Research Grant Agencies (Programmes Hospitaliers de Recherche Clinique, PHRCs). Semi-structured interviews and observation sessions were conducted. Both the time spent assessing each grant application and the assessment methods varied across reviewers. The assessment criteria recommended by the PHRCs were listed by all reviewers as frequently evaluated and useful. However, use of the PHRC criteria was subjective and varied across reviewers. Some reviewers gave the same weight to each assessment criterion, whereas others considered originality to be the most important criterion (12/34), followed by methodology (10/34) and feasibility (4/34). Conceivably, this variability might adversely affect the reliability of the review process, and studies evaluating this hypothesis would be of interest.

Conclusions

Variability across reviewers may result in mistrust among grant applicants about the review process. Consequently, ensuring transparency is of the utmost importance. Consistency in the review process could also be improved by providing common definitions for each assessment criterion and uniform requirements for grant application submissions. Further research is needed to assess the feasibility and acceptability of these measures.

18.
The aim of this paper was to validate an alternative multi-criteria evaluation system to assess animal welfare on farms based on the Welfare Quality® (WQ) project, using an example of welfare assessment of growing pigs. This alternative methodology aimed to be more transparent for stakeholders and more flexible than the methodology proposed by WQ. The WQ assessment protocol for growing pigs was implemented to collect data on different farms in Schleswig-Holstein, Germany. In total, 44 observations were carried out. The aggregation system proposed in the WQ protocol follows a three-step aggregation process. Measures are aggregated into criteria, criteria into principles and principles into an overall assessment. This study focussed on the first two steps of the aggregation. Multi-attribute utility theory (MAUT) was used to produce a value of welfare for each criterion and principle. The utility functions and the aggregation function were constructed in two separate steps. The MACBETH (Measuring Attractiveness by a Categorical-Based Evaluation Technique) method was used for utility function determination and the Choquet integral (CI) was used as an aggregation operator. The WQ decision-makers’ preferences were fitted in order to construct the utility functions and to determine the CI parameters. The validation of the MAUT model was divided into two steps: first, the results of the model were compared with the results of the WQ project at criteria and principle level; second, a sensitivity analysis of our model was carried out to demonstrate the relative importance of welfare measures in the different steps of the multi-criteria aggregation process. Using the MAUT, similar results were obtained to those obtained when applying the WQ protocol aggregation methods, both at criteria and principle level. Thus, this model could be implemented to produce an overall assessment of animal welfare in the context of the WQ protocol for growing pigs.
Furthermore, this methodology could also be used as a framework in order to produce an overall assessment of welfare for other livestock species. Two main findings emerged from the sensitivity analysis: first, a limited number of measures had a strong influence on improving or worsening the level of welfare at criteria level; second, the MAUT model was not very sensitive to an improvement in or a worsening of single welfare measures at principle level. The use of weighted sums and the conversion of disease measures into ordinal scores should be reconsidered.
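
To make the aggregation step more concrete, here is a minimal Python sketch of a discrete Choquet integral, the operator this abstract names for combining utilities. The criterion names, utility values and capacity weights are invented for illustration and are not the parameters fitted in the study.

```python
def choquet_integral(utilities, capacity):
    """Discrete Choquet integral of utilities with respect to a capacity.

    utilities: dict mapping each criterion name to a utility value.
    capacity:  dict mapping frozensets of criterion names to weights in
               [0, 1]; it must define every coalition that arises below,
               with the full set of criteria weighted 1.
    """
    order = sorted(utilities, key=utilities.get)  # ascending utility
    total = 0.0
    previous = 0.0
    for i, name in enumerate(order):
        # Coalition of criteria whose utility is at least utilities[name].
        coalition = frozenset(order[i:])
        total += (utilities[name] - previous) * capacity[coalition]
        previous = utilities[name]
    return total


# Hypothetical two-criterion example.
scores = {"feeding": 0.2, "housing": 0.8}
additive = {
    frozenset(["feeding", "housing"]): 1.0,
    frozenset(["feeding"]): 0.5,
    frozenset(["housing"]): 0.5,
}
strict = {
    frozenset(["feeding", "housing"]): 1.0,
    frozenset(["feeding"]): 0.2,
    frozenset(["housing"]): 0.2,
}
```

With the additive capacity the result reduces to an ordinary weighted mean, whereas the second capacity gives little credit to one good criterion on its own, so an unbalanced profile scores lower. That ability to limit compensation between criteria is the usual reason for preferring a Choquet integral to a weighted sum.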

19.
Humans differ in how they perceive, assess, and measure animal behaviour. This is problematic because strong observer bias can reduce statistical power, accuracy of scientific inference, and in the worst cases, lead to spurious results. Unfortunately, reports and studies of measurement reliability in animal behaviour studies are rare. Here, we investigated two aspects of measurement reliability in working dogs: inter‐observer agreement and criterion validity (comparing novice ratings with those given by experts). We extend for the first time a powerful framework used in human psychological studies to investigate three potential aspects of (dis)agreement in nonhuman animal behaviour research: (a) that some behaviours are easier to observe than others; (b) that some subjects are easier to observe than others; and (c) that observers with different levels of experience with the subject animal give the same or different ratings. We found that novice observers with the same level of experience agreed upon measures of a wide range of behaviours. We found no evidence that age of the dogs affected agreement between these same novice observers. However, when observers with different levels of experience (i.e., novices vs. a working dog expert) assessed the same dogs, agreement appeared to be strongly affected by the measurement instrument used to assess behaviour. Given that animal behaviour research often utilizes different observers with different levels of experience, our results suggest that further tests of how different observers may measure behaviour in different ways are needed across a wider variety of organisms and measurement instruments.

20.
Gait assessment in dairy cattle   Total citations: 1; self-citations: 0; citations by others: 1
Lameness is one of the most important dairy cow welfare issues and has inspired a growing body of literature on gait assessment. Validation studies have shown that several methods of gait assessment are able to successfully distinguish cows with and without painful pathologies. While subjective methods provide an immediate, on-site assessment and require no technical equipment, they show variation in observer reliability. On the other hand, objective methods of gait assessment provide accurate and reliable data, but typically require sophisticated technology, limiting their use on farms. In this critical review, we evaluate gait assessment methods, discuss the reliability and validity of measures used to date, and point to areas where new research is needed. We show how gait can be affected by hoof and leg pathologies, treatment of these ailments and the pain associated with lameness. We also discuss how cow (e.g. conformation, size and udder fill) and environmental features (e.g. flooring) contribute to variation in the way cows walk. An understanding of all these factors is important to avoid misclassifying cows and confounding comparisons between herds.

