Feature selection based on empirical-risk function to detect lesions in vascular computed tomography


Institution:	1. Instituto de Bioquímica Médica, Universidade Federal do Rio de Janeiro – UFRJ, Brazil; 2. Instituto Nacional de Ciência e Tecnologia de Biologia Estrutural e Bioimagem, Universidade Federal do Rio de Janeiro – UFRJ, Brazil; 3. Instituto de Microbiologia Professor Paulo de Góes, Universidade Federal do Rio de Janeiro – UFRJ, Brazil; 4. Instituto Federal de Educação, Ciência e Tecnologia do Rio de Janeiro – IFRJ, Brazil; 5. Instituto Oswaldo Cruz – Fiocruz, Brazil

Abstract:

Objective: The overall goal of the study is to detect coronary artery lesions regardless of their nature (calcified or hypo-dense). To avoid explicit modelling of heterogeneous lesions, we adopted a machine-learning approach using unsupervised or semi-supervised classifiers. The success of such classifiers strongly depends on choosing features that differentiate lesions from regular appearance. The specific goal of this article is to propose a novel strategy for selecting, out of a given set of candidate features, the best feature set for the classifiers used.

Materials and methods: The features are calculated in image planes orthogonal to the artery centerline, and the classifier assigns to each of these cross-sections a label, “healthy” or “diseased”. The contribution of this article is a feature-selection strategy based on the empirical risk function, which is used as a criterion both in the initial feature ranking and in the selection process itself. We assessed this strategy in association with two classifiers based on the density-level detection approach, which seeks outliers from the distribution corresponding to the regular appearance. The method was evaluated using a total of 13,687 cross-sections extracted from 53 coronary arteries in 15 patients.

Results: Using the feature subset selected by the risk-based strategy, the unsupervised and semi-supervised classifiers achieved balanced error rates of 13.5% and 15.4%, respectively. These results were substantially better than the rates achieved using feature subsets selected by supervised strategies. The unsupervised and semi-supervised methods also outperformed supervised classifiers using feature subsets selected by the corresponding supervised strategies.

Discussion: Supervised methods require large data sets annotated by experts, both to select the features and to train the classifiers, and collecting these annotations is time-consuming. With these methods, lesions whose appearance differs from the training data may remain undetected. The lesion-detection problem is highly imbalanced, since healthy cross-sections are usually much more numerous than diseased ones. Training the classifiers based on the density-level detection approach needs only a small number of annotations, or none at all. The same annotations are sufficient to compute the empirical risk and to perform the selection. Therefore, our strategy, associated with an unsupervised or semi-supervised classifier, requires considerably fewer annotations than conventional supervised selection strategies. The proposed approach is also better suited to highly imbalanced problems and can detect lesions that differ from the training set.

Conclusion: The risk-based selection strategy, associated with classifiers using the density-level detection approach, outperformed other strategies and classifiers when used to detect coronary artery lesions. It is well suited to highly imbalanced problems in which the lesions are represented as low-density regions of the feature space, and it can be used in other anomaly-detection problems that can be interpreted as binary classification problems for which the empirical risk can be calculated.
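
The abstract does not give the exact risk function or classifiers, so the following is only a minimal sketch of the general idea under stated assumptions. The balanced error rate on a small labelled set stands in for the empirical risk, a k-nearest-neighbour distance threshold stands in for a density-level detection classifier, and the selection is a simple greedy pass over features ranked by their individual risk. All function names, parameters, and the synthetic data are hypothetical and are not taken from the article.

```python
import numpy as np

def balanced_error_rate(y_true, y_pred):
    """Mean of the miss rate and the false-alarm rate (1 = diseased, 0 = healthy)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    miss = np.mean(y_pred[y_true == 1] == 0)
    false_alarm = np.mean(y_pred[y_true == 0] == 1)
    return 0.5 * (miss + false_alarm)

def knn_outlier_labels(X_ref, X_eval, k=5, quantile=0.95):
    """Stand-in outlier detector (assumption, not the article's classifier).

    A sample is labelled 1 (outlier) when its distance to the k-th nearest
    reference sample exceeds a quantile of the same distance measured
    within the reference set itself.
    """
    def kth_distance(A, B, skip_self):
        d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
        d.sort(axis=1)
        # Within the reference set, column 0 is the zero distance to itself.
        return d[:, k if skip_self else k - 1]

    threshold = np.quantile(kth_distance(X_ref, X_ref, True), quantile)
    return (kth_distance(X_eval, X_ref, False) > threshold).astype(int)

def risk(X_ref, X_val, y_val, features):
    """Stand-in empirical risk of the detector restricted to `features`."""
    cols = list(features)
    pred = knn_outlier_labels(X_ref[:, cols], X_val[:, cols])
    return balanced_error_rate(y_val, pred)

def select_features(X_ref, X_val, y_val):
    """Rank features by individual risk, then add one only if the risk drops."""
    n_features = X_ref.shape[1]
    ranking = sorted(range(n_features),
                     key=lambda j: risk(X_ref, X_val, y_val, [j]))
    selected = [ranking[0]]
    best = risk(X_ref, X_val, y_val, selected)
    for j in ranking[1:]:
        candidate = selected + [j]
        r = risk(X_ref, X_val, y_val, candidate)
        if r < best:
            selected, best = candidate, r
    return selected, best

# Synthetic illustration only: feature 0 separates the two classes,
# features 1 to 3 are pure noise.
rng = np.random.default_rng(0)
X_ref = rng.normal(size=(200, 4))        # unlabelled, mostly regular samples
healthy = rng.normal(size=(100, 4))
diseased = rng.normal(size=(20, 4))
diseased[:, 0] += 8.0
X_val = np.vstack([healthy, diseased])   # small labelled set used for the risk
y_val = np.array([0] * 100 + [1] * 20)

selected, best = select_features(X_ref, X_val, y_val)
print("selected features:", selected, "risk:", round(float(best), 3))
```

In this sketch the detector is fitted on unlabelled samples only, and the few labels in `y_val` are used solely to evaluate the risk, which mirrors the abstract's point that the same small set of annotations can serve both risk computation and feature selection.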

Keywords:
This article is indexed in ScienceDirect and other databases.
