Pepe MS  Cai T  Longton G 《Biometrics》2006,62(1):221-229
No single biomarker for cancer is considered adequately sensitive and specific for cancer screening. It is expected that the results of multiple markers will need to be combined in order to yield adequately accurate classification. Typically, the objective function that is optimized for combining markers is the likelihood function. In this article, we consider an alternative objective function: the area under the empirical receiver operating characteristic curve (AUC). We note that it yields consistent estimates of parameters in a generalized linear model for the risk score but does not require specifying the link function. Like logistic regression, it yields consistent estimation with case-control or cohort data. Simulation studies suggest that AUC-based classification scores have performance comparable with logistic likelihood-based scores when the logistic regression model holds. Analysis of data from a proteomics biomarker study shows that performance can be far superior to logistic regression derived scores when the logistic regression model does not hold. Model fitting by maximizing the AUC rather than the likelihood should be considered when the goal is to derive a marker combination score for classification or prediction.
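The idea of choosing a marker combination by maximizing the empirical AUC can be illustrated with a minimal sketch. It assumes two markers, a linear score y1 + beta * y2 with the first coefficient fixed at 1, and a plain grid search; the function names and the toy data are hypothetical and are not taken from the article.

```python
from itertools import product


def empirical_auc(case_scores, control_scores):
    """Mann-Whitney estimate of the AUC: the fraction of (case, control)
    pairs in which the case scores higher, counting ties as one half."""
    wins = 0.0
    for c, d in product(case_scores, control_scores):
        if c > d:
            wins += 1.0
        elif c == d:
            wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def combined_score(markers, beta):
    """Linear combination y1 + beta * y2 for each (y1, y2) marker pair."""
    return [y1 + beta * y2 for y1, y2 in markers]


def best_beta(cases, controls, grid):
    """Grid search for the beta with the largest empirical AUC
    (an illustrative optimizer, assumed here for simplicity)."""
    return max(
        grid,
        key=lambda b: empirical_auc(
            combined_score(cases, b), combined_score(controls, b)
        ),
    )


# Toy data (made up): each tuple is (marker 1, marker 2) for one subject.
cases = [(1.0, 2.0), (0.5, 3.0), (2.0, 1.0)]
controls = [(1.5, 0.0), (0.0, 1.0), (1.0, 0.5)]
beta_hat = best_beta(cases, controls, [0.0, 1.0])
```

Because the empirical AUC depends only on the ranking of the scores, the objective is a step function of beta, which is why a search over a grid is used in this sketch instead of a gradient method.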

Aim: The area under the receiver operating characteristic (ROC) curve (AUC) is a widely used statistic for assessing the discriminatory capacity of species distribution models. Here, I used simulated data to examine the interdependence of the AUC and classical discrimination measures (sensitivity and specificity) derived for the application of a threshold. I shall further exemplify with simulated data the implications of using the AUC to evaluate potential versus realized distribution models. Innovation: After applying the threshold that makes sensitivity and specificity equal, a strong relationship between the AUC and these two measures was found. This result is corroborated with real data. On the other hand, the AUC penalizes the models that estimate potential distributions (the regions where the species could survive and reproduce due to the existence of suitable environmental conditions), and favours those that estimate realized distributions (the regions where the species actually lives). Main conclusions: Firstly, the independence of the AUC from the threshold selection may be irrelevant in practice. This result also emphasizes the fact that the AUC assumes nothing about the relative costs of errors of omission and commission. However, in most real situations this premise may not be optimal. Measures derived from a contingency table for different cost ratio scenarios, together with the ROC curve, may be more informative than reporting just a single AUC value. Secondly, the AUC is only truly informative when there are true instances of absence available and the objective is the estimation of the realized distribution. When the potential distribution is the goal of the research, the AUC is not an appropriate performance measure because the weight of commission errors is much lower than that of omission errors.

MOTIVATION: Protein expression profiling for differences indicative of early cancer holds promise for improving diagnostics. Because proteomic data from mass spectrometers are high dimensional, their statistical analysis is challenging in many respects, such as dimension reduction, feature subset selection and the construction of classification rules. The search for an optimal feature subset, commonly known as the feature subset selection (FSS) problem, is an important step towards disease classification/diagnostics with biomarkers. METHODS: We develop a parsimonious threshold-independent feature selection (PTIFS) method based on the concept of area under the curve (AUC) of the receiver operating characteristic (ROC). To reduce computational complexity to a manageable level, we use a sigmoid approximation to the empirical AUC as the criterion function. Starting from an anchor feature, the PTIFS method selects a feature subset through an iterative updating algorithm. Highly correlated features that have similar discriminating power are precluded from being selected simultaneously. The classification rule is then determined from the resulting feature subset. RESULTS: The performance of the proposed approach is investigated by extensive simulation studies, and by applying the method to two mass spectrometry data sets of prostate cancer and of liver cancer. We compare the new approach with the threshold gradient descent regularization (TGDR) method. The results show that our method can achieve comparable performance to that of the TGDR method in terms of disease classification, but with fewer features selected. AVAILABILITY: Supplementary Material and the PTIFS implementations are available at http://staff.ustc.edu.cn/~ynyang/PTIFS. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
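A sigmoid approximation to the empirical AUC replaces the indicator "case score exceeds control score" with a smooth function of the score difference. The sketch below shows one common form of that idea; the bandwidth parameter `h` and the function names are assumptions made for illustration and are not the PTIFS implementation.

```python
import math


def sigmoid(x, h=0.1):
    """Logistic function of x / h, written to avoid overflow for large |x / h|.
    The bandwidth h is an assumed tuning parameter: smaller h is closer to a step."""
    z = x / h
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def smoothed_auc(case_scores, control_scores, h=0.1):
    """Average of sigmoid((case - control) / h) over all case-control pairs.
    As h shrinks this approaches the empirical (Mann-Whitney) AUC."""
    total = sum(sigmoid(c - d, h) for c in case_scores for d in control_scores)
    return total / (len(case_scores) * len(control_scores))


# Toy scores (made up): perfectly separated groups give a value close to 1.
auc_separated = smoothed_auc([3.0, 4.0], [1.0, 2.0], h=0.01)
```

The smooth criterion is differentiable in the scores, which is what makes it cheaper to optimize than the step-function empirical AUC.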

In the setting of a longitudinal study, subjects are followed for the occurrence of some dichotomous outcome. In many of these studies, some markers are also obtained repeatedly during the study period. Emir et al. introduced a non-parametric approach to the estimation of the area under the ROC curve of a repeated marker. Their non-parametric estimate involves assigning a weight to each subject. There are two weighting schemes suggested in their paper: one for the case when within-patient correlation is low, and the other for the case when within-patient correlation is high. However, it is not clear how to assign weights to marker measurements when within-patient correlation is modest. In this paper, we consider the optimal weights that minimize the variance of the estimate of the area under the ROC curve (AUC) of a repeated marker, as well as the optimal weights that minimize the variance of the AUC difference between two repeated markers. Our results show that the optimal weights depend not only on the within-patient control-case correlation in the longitudinal data, but also on the proportion of subjects that become cases. More importantly, we show that the loss of efficiency from using the two weighting schemes suggested by Emir et al. instead of our optimal weights can be severe when there is a large within-patient control-case correlation and the proportion of subjects that become cases is small, which is often the case in longitudinal study settings.

Evaluation of diagnostic performance is typically based on the receiver operating characteristic (ROC) curve and the area under the curve (AUC) as its summary index. The partial area under the curve (pAUC) is an alternative index focusing on the range of practical/clinical relevance. One of the problems preventing more frequent use of the pAUC is the perceived loss of efficiency in cases of noncrossing ROC curves. In this paper, we investigated statistical properties of comparisons of two correlated pAUCs. We demonstrated that outside of the classic model there are practically reasonable ROC types for which comparisons of noncrossing concave curves would be more powerful when based on a part of the curve rather than the entire curve. We argue that this phenomenon stems in part from the exclusion of noninformative parts of the ROC curves that resemble straight lines. We conducted extensive simulation studies in families of binormal, straight-line, and bigamma ROC curves. We demonstrated that comparison of pAUCs is statistically more powerful than comparison of full AUCs when ROC curves are close to a "straight line". For less flat binormal ROC curves an increase in the integration range often leads to a disproportional increase in pAUCs' difference, thereby contributing to an increase in statistical power. Thus, efficiency of differences in pAUCs of noncrossing ROC curves depends on the shape of the curves, and for families of ROC curves that are nearly straight-line shaped, such as bigamma ROC curves, there are multiple practical scenarios in which comparisons of pAUCs are preferable.
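For readers unfamiliar with the pAUC, the following minimal sketch computes it from an empirical ROC curve by trapezoidal integration over false-positive rates from 0 up to a chosen limit. The function names, the choice of integrating over the false-positive axis, and the toy scores are assumptions for illustration; the paper's own estimators are not reproduced here.

```python
def roc_points(case_scores, control_scores):
    """Empirical ROC points (FPR, TPR), obtained by lowering the threshold
    through every observed score; starts at (0, 0) and ends at (1, 1)."""
    thresholds = sorted(set(case_scores) | set(control_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        tpr = sum(s >= t for s in case_scores) / len(case_scores)
        fpr = sum(s >= t for s in control_scores) / len(control_scores)
        points.append((fpr, tpr))
    return points


def partial_auc(points, max_fpr):
    """Trapezoidal area under the ROC curve for FPR in [0, max_fpr].
    The segment that crosses max_fpr is cut by linear interpolation."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy scores (made up): perfectly separated cases and controls.
points = roc_points([3, 4], [1, 2])
full_area = partial_auc(points, 1.0)   # the full AUC
early_area = partial_auc(points, 0.2)  # pAUC over FPR in [0, 0.2]
```

Setting `max_fpr` to 1.0 recovers the full AUC, so the same routine can be used to compare full and partial areas on identical data.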

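Objective: Ecological niche models are widely applied in biogeography, invasion biology and conservation biology, and are increasingly used to predict the potential and realized distributions of species. Using the fall webworm as an example, this paper introduces the application of the pROC approach to niche model evaluation, together with its caveats, with the aim of supporting sound evaluation of potential distribution predictions and promoting the appropriate use and development of niche models in China. Methods: We introduce the basic principles of the ROC curve and the AUC value and summarize their use in niche model evaluation. Starting from the reliability of species presence and absence points, we analyse the strengths and weaknesses of the AUC value for model evaluation, and finally introduce the partial area under the receiver operating characteristic curve (the pROC approach) to compensate for the weaknesses of the traditional AUC. Results: Although the AUC is independent of the threshold, it combines sensitivity and specificity and therefore masks the individual behaviour of these two measures. It cannot assess the sensitivity and specificity of a prediction separately, nor can it weigh the omission rate against the commission rate, and so it can mislead users in their evaluation of a model. Compared with the AUC value, the shape of the ROC curve is more valuable and carries rich information for model evaluation. Conclusions: Model evaluation needs to treat sensitivity and specificity separately, and the shape of the ROC curve is more important than the AUC value in niche model evaluation. The pROC approach has advantages over the traditional AUC, but it is prone to misjudging overfitted models. Model evaluation is closely tied to the purpose of the study: when the purpose is to predict the potential distribution of a species (for example the potential distribution of invasive species, the effect of climate change on species distributions, or phylogeography), model evaluation should give more weight to sensitivity (or the omission rate); when the purpose is to predict the realized distribution of a species (for example delimiting protected areas or introducing endangered species), model evaluation should give equal weight to sensitivity and specificity.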

While studies have been conducted using human cadaver lumbar spines to understand injury biomechanics in terms of stability/energy to fracture, and physiological responses under pure-moment/follower loads, data are sparse for inferior-to-superior impacts. Injuries occur under this mode from underbody blasts. The objectives were to determine the role of age, disc area, and trabecular bone density in tolerances and risk curves under vertical loading from a controlled group of specimens. T12-S1 columns were obtained, pretest X-rays and CTs taken, load cells attached to both ends, impacts applied at the S1 end using a custom vertical accelerator device, and posttest X-ray, CT, and dissections done. BMD of L2-L4 vertebrae were obtained from QCT. Survival analysis-based Human Injury Probability Curves (HIPCs) were derived using proximal and distal forces. Age, area, and BMD were covariates. Forces were considered uncensored, representing the load carrying capacity. The Akaike Information Criterion was used to determine optimal distributions. The mean forces, ±95% confidence intervals, and Normalized Confidence Interval Size (NCIS) were computed. The Lognormal distribution was the optimal function for both forces. Age, area, and BMD were not significant (p > 0.05) covariates for distal forces, while only BMD was significant for proximal forces. The NCIS was the lowest for force-BMD covariate HIPC. The HIPCs for both genders at 35 and 45 years were based on population BMDs. These HIPCs serve as human tolerance criteria for automotive, military, and other applications. In this controlled group of samples, BMD is a better predictor-covariate that characterizes lumbar column injury under inferior-to-superior impacts.

  1. There is growing evidence that prey perceive the risk of predation and alter their behavior in response, resulting in changes in spatial distribution and potential fitness consequences. Previous approaches to mapping predation risk across a landscape quantify predator space use to estimate potential predator‐prey encounters, yet this approach does not account for successful predator attack resulting in prey mortality. An exception is a prey kill site that reflects an encounter resulting in mortality, but obtaining information on kill sites is expensive and requires time to accumulate adequate sample sizes.
  2. We illustrate an alternative approach using predator scat locations and their contents to quantify spatial predation risk for elk (Cervus canadensis) from multiple predators in the Rocky Mountains of Alberta, Canada. We surveyed over 1300 km to detect scats of bears (Ursus arctos/U. americanus), cougars (Puma concolor), coyotes (Canis latrans), and wolves (C. lupus). To derive spatial predation risk, we combined predictions of scat‐based resource selection functions (RSFs) weighted by predator abundance with predictions that a predator‐specific scat in a location contained elk. We evaluated the scat‐based predictions of predation risk by correlating them to predictions based on elk kill sites. We also compared scat‐based predation risk on summer ranges of elk following three migratory tactics for consistency with telemetry‐based metrics of predation risk and cause‐specific mortality of elk.
  3. We found a strong correlation between the scat-based approach presented here and predation risk predicted by kill sites (r = .98, p < .001). Elk migrating east of the Ya Ha Tinda winter range were exposed to the highest predation risk from cougars, resident elk summering on the Ya Ha Tinda winter range were exposed to the highest predation risk from wolves and coyotes, and elk migrating west to summer in Banff National Park were exposed to the highest risk of encountering bears, but elk were less likely to be found in bear scats there than in other areas. These patterns were consistent with previous estimates of spatial risk based on telemetry of collared predators and recent cause-specific mortality patterns in elk.
  4. A scat-based approach can provide a cost-efficient alternative to kill sites for quantifying broad-scale spatial patterns in predation risk for prey, particularly in systems with multiple predator species.

MALANI  HINA MEHTA 《Biometrika》1995,82(3):515-526
Disease markers are time-dependent covariates which describe progression towards development of disease. Traditional methods in survival analysis do not make use of available data on these markers to recover additional information from censored individuals. Using a heuristic modification of the redistribution to the right algorithm (Efron, 1967), a new approach for recovering information for censored individuals using disease markers is proposed. Additionally, the statistical properties of the proposed method are examined. There are two possible advantages to this modification: (i) bias reduction when censoring is informative, and (ii) an increase in efficiency in the case of truly noninformative censoring.

The Asian longhorned beetle Anoplophora glabripennis (Motschulsky) (Coleoptera: Cerambycidae) is one of the most dangerous xylophagous pests affecting broadleaf trees in the world. Eradication programmes are undertaken in non-native regions, requiring extensive resources and involving high costs. An adapted strategy must be set up to optimize the ratio of cost to probability of success. We developed a method to generate a risk index of A. glabripennis presence at a local scale, in the surrounding area of an infestation, using field observations (counts of adult insects, exit holes and infested trees). The method, mathematically based on the bivariate symmetric Laplace distribution, thus has reasonable input requirements. The output risk map is easy to interpret and can be directly used by decision-makers. We used our approach in three infestations in Switzerland. The risk map represented the insect pressure (beetle population density) well. We highlighted the fact that survey boundaries, commonly chosen using constant distances from the infestation, should be selected according to the spatial distribution of the insect pressure, to prioritize monitoring activities. The risk map provides a helpful instrument for advanced survey planning after a first overview, for example to decide which area and which host trees should be inspected for infestations.

Legionella species are the causative agents of human legionellosis, and bathing facilities have been identified as the sources of infection in several outbreaks in Japan. Researchers in Japan have recently reported evidence of significant associations between bacterial counts and the occurrence of Legionella in bathing facilities and in a hot tub model. A convenient and quantitative bacterial enumeration method is therefore required as an indicator of Legionella contamination or disinfection to replace existing methods such as time-consuming Legionella culture and expensive Legionella-DNA amplification. In this study, we developed a rapid detection method (RDM) to monitor the risk of Legionella using an automated microbial analyzing device based on flow cytometry techniques to measure the total number of bacteria in water samples within two minutes, by detecting typical patterns of scattered light and fluorescence. We first compared the results of our RDM with plate counting results for five filtered hot spring water samples spiked with three species of bacteria, including Legionella. Inactivation of these samples by chlorine was also assessed by the RDM, a live/dead bacterial fluorescence assay and plate counting. Using the RDM, the lower limit of quantitative bacterial counts in the spiked samples was determined as 3.0 × 10³ (3.48 log) counts mL⁻¹. We then used a laboratory model of a hot tub and found that the RDM could monitor the growth curve of naturally occurring heterotrophic bacteria with 1 and 2 days' delayed growth of amoeba and Legionella, respectively, and could also determine the killing curve of these bacteria by chlorination. Finally, samples with ≥ 3.48 or < 3.48 log total bacterial counts mL⁻¹ were tested using the RDM from 149 different hot tubs, and were found to be significantly associated with the positive or negative detection of Legionella with 95% sensitivity and 84% specificity. These findings indicated that the RDM can be used for Legionella control at bathing facilities, especially those where the effectiveness of chlorine is reduced by the presence of Fe²⁺, Mn²⁺, NH₄⁺, skin debris, and/or biofilms in the water.

Computerised image analysis was utilised to enumerate the attachment of Staphylococcus epidermidis to HEp2 cell monolayers. A differential staining technique was employed such that individual staphylococcal cells stood out in sharp contrast against the uneven cell surface and granular contents of the epithelial cells. The primary image analysis operation involved subtracting an out-of-focus image from an in-focus image of the bacteria on the monolayer, thereby accentuating the bacterial image. Enumeration, using a particle counting routine, was rapid and reproducible, facilitating counting in excess of 700 bacteria per field at ×500 magnification. The computerised programme compared favourably with manual counting and would provide a rapid, objective and morphologically discriminatory method for evaluating bacterial attachment to various tissues.

Quantification is a major problem when using histology to study the influence of ecological factors on tree structure. This paper presents a method to prepare and to analyse transverse sections of cambial zone and of conductive phloem in bark samples. The following paper (II) presents the automated measurement procedure. Part I here describes and discusses the preparation method, and the influence of tree age on the observed structure. Highly contrasted images of samples extracted at breast height during dormancy were analysed with an automatic image analyser. Between three young (38 years) and three old (147 years) trees, age-related differences were identified by size and shape parameters, at both cell and tissue levels. In the cambial zone, older trees had larger and more rectangular fusiform initials. In the phloem, sieve tubes were also larger, but their shape did not change and the area for sap conduction was similar in both categories. Nevertheless, alterations were limited, and demanded statistical analysis to be identified and ascertained. The physiological implications of the structural changes are discussed.


Cadmium (Cd) phytoremediation potential and its accumulation in edible and nonedible plant tissues is the function of various biochemical processes taking place inside plants. This study assessed the impact of organic ligands on Cd phytouptake and different biophysiochemical processes of Spinacia oleracea L., and associated health hazards. Plants were exposed to Cd alone and chelated with citric acid (CA) and ethylenediaminetetraacetic acid (EDTA). Results revealed that the effect of Cd on lipid peroxidation, H2O2 production and pigment contents varied greatly with its applied level and the type of organic ligand. Moreover, the effect was more prominent in root tissues than leaf tissues and for high concentrations of Cd and organic ligands. Cadmium accumulation increased by 90 and 74% in roots and leaves, respectively, with increasing Cd levels (25–100 µM). Cadmium exposure at high levels caused lipid peroxidation in roots only. Application of both CA and EDTA slightly diminished Cd toxicity with respect to pigment contents, lipid peroxidation and hydrogen peroxide (H2O2) contents. Hazard quotient (HQ) of Cd was <1.00 for all the treatments. Under nonlinear effect of treatments, multivariate analysis can be an effective tool to trace overall effects/trends.

A simple and highly sensitive stability-indicating HPLC method was developed and validated for the determination of the new antidepressant agent, agomelatine (AGM). Separation of AGM from its stress-induced degradation products was achieved on a BDS Hypersil phenyl column (250 mm × 4.6 mm i.d., 5 µm particle size) using methanol–0.05 M phosphate buffer of pH 2.5 (35:65, v/v) as a mobile phase with fluorescence detection at 230/370 nm. Naproxen was used as an internal standard. The method satisfied all the validation requirements, as evidenced by good linearity (correlation coefficient of 0.9999, over the concentration range 0.4–40.0 ng/mL), accuracy (recovery average 99.55 ± 0.90%), precision (intra-day RSD 0.54–1.35% and inter-day RSD 0.93–1.26%), robustness and specificity. The stability of AGM was investigated under different ICH recommended stress conditions including acidic, alkaline, neutral, oxidative and photolytic. AGM was found to be labile to acidic and alkaline degradation and a kinetic study was conducted to explore its degradation behavior. First-order degradation rate constants and half-life times were calculated in each case. The proposed method was applied for the determination of AGM in tablets and spiked human plasma with mean percentage recoveries of 99.87 ± 0.31 (n = 3) and 102.09 ± 5.01 (n = 5), respectively. Hence, the proposed method was successfully applied for the determination of AGM in human volunteer plasma. The results were compared statistically with those obtained by a comparison HPLC method revealing no significant differences between the two methods regarding accuracy and precision. Copyright © 2014 John Wiley & Sons, Ltd.
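For first-order degradation, concentration follows C(t) = C0 · exp(−k t), so ln C is linear in t with slope −k and the half-life is ln 2 / k. The sketch below estimates k by least squares on ln C and then derives the half-life; the function names and the synthetic data are assumptions for illustration and are not values from the study.

```python
import math


def first_order_rate_constant(times, concentrations):
    """Least-squares slope of ln(C) against t; the rate constant k is the
    negated slope. Units of k are the reciprocal of the time units."""
    logs = [math.log(c) for c in concentrations]
    n = len(times)
    t_mean = sum(times) / n
    l_mean = sum(logs) / n
    sxy = sum((t - t_mean) * (l - l_mean) for t, l in zip(times, logs))
    sxx = sum((t - t_mean) ** 2 for t in times)
    return -sxy / sxx


def half_life(k):
    """Half-life of a first-order process: ln(2) / k."""
    return math.log(2) / k


# Synthetic data (made up): exact first-order decay with k = 0.1 per hour.
times = [0.0, 1.0, 2.0, 4.0, 8.0]
concentrations = [40.0 * math.exp(-0.1 * t) for t in times]
k_hat = first_order_rate_constant(times, concentrations)
t_half = half_life(k_hat)
```

With real measurements the points will scatter around the line, and a clearly curved plot of ln C against t would suggest that the first-order assumption does not hold.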

A simple and sensitive method for linkage analysis is described, which is based on conformation-sensitive gel electrophoresis (CSGE). Using urea-containing agarose gels or a commercially available polyacrylamide-derived matrix, 13 polymorphic markers were newly identified for known genes of the silkworm, Bombyx mori, which had been scored as monomorphic by PCR-RFLP analysis. This method for detecting polymorphisms is quite sensitive, and can be performed with inexpensive reagents and apparatus that is available in most molecular biology laboratories. Received: 19 November 1998 / Accepted: 2 March 1999

Das R  Gerstein M 《Proteins》2004,55(2):455-463
We have introduced a method to identify functional shifts in protein families. Our method is based on the calculation of an active-site conservation ratio, which we call the "ASC ratio." For a structurally based alignment of a protein family, this ratio is the average sequence similarity of the active-site region compared to the full-length protein. The active-site region is defined as all the residues within a certain radius of the known functionally important groups. Using our method, we have analyzed enzymes of central metabolism from a large number of genomes (35). We found that for most of the enzymes, the active-site region is more highly conserved than the full-length sequence. However, for three tricarboxylic acid (TCA)-cycle enzymes, active-site sequences are considerably more diverged than full-length ones. In particular, we were able to identify in six pathogens a novel isocitrate dehydrogenase that has very low sequence similarity around the active site. Detailed sequence-structure analysis indicates that while the active-site structure of isocitrate dehydrogenase is most likely similar between pathogens and nonpathogens, the unusual sequence divergence could result from an extra domain added at the N-terminus. This domain has a leucine-rich motif similar to one in the Yersinia pestis cytotoxin and may therefore confer additional pathogenic functions.
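The ratio described above can be sketched for a single pair of aligned sequences. This is a deliberately simplified illustration: it uses percent identity in place of the paper's similarity measure, handles one pair instead of averaging over a family, and takes the active-site positions as a given list of alignment columns. All names and the toy sequences are hypothetical.

```python
def identity(seq_a, seq_b, positions=None):
    """Fraction of identical residues between two aligned sequences over the
    given alignment columns (all columns by default), skipping gap columns."""
    if positions is None:
        positions = range(len(seq_a))
    pairs = [
        (seq_a[i], seq_b[i])
        for i in positions
        if seq_a[i] != "-" and seq_b[i] != "-"
    ]
    return sum(a == b for a, b in pairs) / len(pairs)


def asc_ratio(seq_a, seq_b, active_site_positions):
    """Active-site identity divided by full-length identity.
    Values above 1 mean the active site is more conserved than the whole protein."""
    return identity(seq_a, seq_b, active_site_positions) / identity(seq_a, seq_b)


# Toy aligned sequences (made up); columns 0-2 play the role of the active site.
seq_a = "ACDEFGHIKL"
seq_b = "ACDQFGHVKL"
ratio = asc_ratio(seq_a, seq_b, [0, 1, 2])
```

Under this simplified definition, a ratio well below 1 would flag the kind of active-site divergence that the abstract reports for some TCA-cycle enzymes.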

A global metabolic profiling methodology based on gas chromatography coupled to time-of-flight mass spectrometry (GC-TOFMS) for human plasma was applied to a human exercise study focused on the effects of beverages containing glucose, galactose, or fructose taken after exercise and throughout a recovery period of 6 h and 45 min. One group of 10 well trained male cyclists performed 3 experimental sessions on separate days (randomized, single center). After performing a standardized depletion protocol on a bicycle, subjects consumed one of three different beverages: maltodextrin (MD)+glucose (2:1 ratio), MD+galactose (2:1), and MD+fructose (2:1), consumed at an average of ~1.25 g of carbohydrate (CHO) ingested per minute. Blood was taken straight after exercise and every 45 min within the recovery phase. With the resulting blood plasma, insulin, free fatty acid (FFA) profile, glucose, and GC-TOFMS global metabolic profiling measurements were performed. The resulting profiling data was able to match the results obtained from the other clinical measurements with the addition of being able to follow many different metabolites throughout the recovery period. The data quality was assessed, with all the labelled internal standards yielding values of <15% CV for all samples (n=335), apart from the labelled sucrose which gave a value of 15.19%. Differences between recovery treatments including the appearance of galactonic acid from the galactose based beverage were also highlighted.
