Similar literature
1.
Fuzzy decision trees are a powerful, top-down, hierarchical search methodology for extracting human-interpretable classification rules. However, they are often criticized for poor learning accuracy. In this paper, we propose Neuro-Fuzzy Decision Trees (N-FDTs): a fuzzy decision tree structure with a neural-like parameter adaptation strategy. In the forward cycle, we construct fuzzy decision trees using any of the standard induction algorithms, such as fuzzy ID3. In the feedback cycle, the parameters of the fuzzy decision trees are adapted using a stochastic gradient descent algorithm by traversing back from the leaf nodes to the root node. With this strategy, the hierarchical structure of the fuzzy decision trees is kept intact during the parameter adaptation stage. Applying the backpropagation algorithm directly to the structure of fuzzy decision trees improves learning accuracy without compromising comprehensibility (interpretability). The proposed methodology has been validated using computational experiments on real-world datasets.

2.
One symbolic (rule-based inductive learning) and one connectionist (neural network) machine learning technique were used to reconstruct muscle activation patterns from kinematic data measured during normal human walking at several speeds. The activation patterns (or desired outputs) consisted of surface electromyographic (EMG) signals from the semitendinosus and vastus medialis muscles. The inputs consisted of flexion and extension angles measured at the hip and knee of the ipsilateral leg, their first and second derivatives, and bilateral foot contact information. The training set consisted of data from six trials, at two different speeds. The testing set consisted of data from two additional trials (one at each speed), which were not in the training set. It was possible to reconstruct the muscular activation at both speeds using both techniques. Timing of the reconstructed signals was accurate. The integrated value of the activation bursts was less accurate. The neural network gave a continuous output, whereas the rule tree produced by rule-based inductive learning gave a quantised activation level. The advantage of rule-based inductive learning was that the rules used were both explicit and comprehensible, whilst the rules used by the neural network were implicit within its structure and not easily comprehended. The neural network was able to reconstruct the activation patterns of both muscles from one network, whereas two separate rule sets were needed for the rule-based technique. It is concluded that machine learning techniques, in comparison to explicit inverse musculoskeletal models, show good promise in modelling nearly cyclic movements such as locomotion at varying walking speeds. However, they do not provide insight into the biomechanics of the system, because they are not based on the biomechanical structure of the system.

3.
This paper introduces a novel approach to gene selection based on a substantial modification of the analytic hierarchy process (AHP). The modified AHP systematically integrates the outcomes of individual filter methods to select the most informative genes for microarray classification. Five individual ranking methods (t-test, entropy, receiver operating characteristic (ROC) curve, Wilcoxon, and signal-to-noise ratio) are employed to rank genes. These ranked genes are then used as inputs to the modified AHP. Additionally, a method that uses the fuzzy standard additive model (FSAM) for cancer classification based on genes selected by AHP is also proposed in this paper. Traditional FSAM learning is a hybrid process comprising unsupervised structure learning and supervised parameter tuning. A genetic algorithm (GA) is incorporated between unsupervised and supervised training to optimize the number of fuzzy rules. The integration of the GA enables FSAM to deal with the high-dimensional, low-sample nature of microarray data and thus enhances the efficiency of the classification. Experiments are carried out on numerous microarray datasets. Results demonstrate that AHP-based gene selection outperforms the single ranking methods. Furthermore, the AHP-FSAM combination achieves high accuracy in microarray data classification compared to various competing classifiers. The proposed approach is therefore useful for medical practitioners and clinicians as a decision support system that can be implemented in real medical practice.
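As an illustration of one of the filter methods named above, the following is a minimal sketch of ranking genes by signal-to-noise ratio. It assumes the common two-class definition (difference of class means divided by the sum of class standard deviations); the abstract does not state which variant the authors used, and the function names and toy expression values are hypothetical. It covers a single filter only and does not attempt the modified AHP integration step.

```python
from statistics import mean, stdev


def snr_score(class_a, class_b):
    """Signal-to-noise ratio of one gene: (mean_a - mean_b) / (sd_a + sd_b).

    Assumed definition; not confirmed by the abstract.
    """
    return (mean(class_a) - mean(class_b)) / (stdev(class_a) + stdev(class_b))


def rank_genes(expression, labels):
    """Return gene names sorted by descending absolute SNR.

    expression: dict mapping gene name -> list of values, one per sample
    labels:     list of 0/1 class labels aligned with the samples
    """
    scores = {}
    for gene, values in expression.items():
        a = [v for v, y in zip(values, labels) if y == 0]
        b = [v for v, y in zip(values, labels) if y == 1]
        scores[gene] = abs(snr_score(a, b))
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: three genes measured on six samples.
labels = [0, 0, 0, 1, 1, 1]
expression = {
    "gene_a": [1.0, 1.2, 0.8, 5.0, 5.2, 4.8],  # classes well separated
    "gene_b": [2.0, 3.0, 4.0, 2.5, 3.5, 4.5],  # classes overlap heavily
    "gene_c": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],  # moderate separation
}
ranking = rank_genes(expression, labels)
print(ranking)
```

Each of the other four filters would produce its own ranking in the same way; combining those rankings is the part the paper assigns to the modified AHP.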

4.
The use of Inertial Measurement Units (IMUs) for spatial gait analysis has opened the door to unconstrained measurements within the home and community. Bandwidth, cost limitations, and ease of use have historically restricted the number and location of sensors worn on the body. In this paper, we describe a four-sensor configuration of IMUs placed on the shanks and thighs that is sufficient to provide an accurate measure of temporal gait parameters, spatial gait parameters, and joint angle dynamics during ambulation. Estimating spatial gait parameters solely from gyroscope data is preferred because gyroscopes are less susceptible to sensor noise and a system comprising only gyroscopes uses less bandwidth than a typical 9 degree-of-freedom IMU. The purpose of this study was to determine the validity of a novel method of step length estimation using gyroscopes attached to the shanks and thighs. An Inverted Pendulum Model (IPM) algorithm was proposed to calculate step length, stride length, and gait speed. The algorithm incorporates heel-strike events and average forward velocity per step to make these assessments. IMU algorithm accuracy was determined via concurrent validity with an instrumented walkway, and results were explained via the collision model of gait. The IPM produced accurate estimates of step length, stride length, and gait speed, with a mean difference of 3 cm and an RMSE of 6.6 cm for step length, thus establishing a new approach for spatial gait parameter calculation. The lack of numerical integration in the IPM makes it well suited for use in continuous monitoring applications where sensor sampling rates are restricted.

5.
The use of a linguistic representation for expressing knowledge acquired by learning systems is an important issue for user understanding. Under this assumption, and to ensure that these systems are accepted and used, several techniques have been developed by the artificial intelligence community, under both the symbolic and the connectionist approaches. This work discusses and investigates three knowledge extraction techniques based on these approaches. The first two techniques, the C4.5 and CN2 symbolic learning algorithms, extract knowledge directly from the data set. The last technique, the TREPAN algorithm, extracts knowledge from a previously trained neural network. The CN2 algorithm induces if...then rules from a given data set. The C4.5 algorithm extracts decision trees from the data set, although it can also extract ordered rules. Decision trees are also the knowledge representation used by the TREPAN algorithm.

6.
This paper addresses the question of maximizing classifier accuracy for classifying task-related mental activity from magnetoencephalography (MEG) data. We propose the use of different sources of information and introduce an automatic channel selection procedure. To determine an informative set of channels, our approach combines a variety of machine learning algorithms: feature subset selection methods, classifiers based on regularized logistic regression, information fusion, and multiobjective optimization based on probabilistic modeling of the search space. The experimental results show that our proposal is able to improve classification accuracy compared to approaches whose classifiers use only one type of MEG information or for which the set of channels is fixed a priori.

7.
One of the major research directions in bioinformatics is that of predicting the protein superfamily in large databases and classifying a given set of protein domains into superfamilies. The classification reflects structural, evolutionary, and functional relatedness. These relationships are embodied in hierarchical classifications such as the Structural Classification of Proteins (SCOP), which is manually curated. Such classification is essential for the structural and functional analysis of proteins. Yet a large number of proteins remain unclassified. We propose an unsupervised machine-learning algorithm, the FuzzyART neural network, to classify a given set of proteins into SCOP superfamilies. The proposed method learns quickly and uses an atypical non-linear pattern recognition technique. In this approach, we constructed a similarity matrix from the p-values of an all-against-all BLAST search, trained the network with the FuzzyART unsupervised learning algorithm using the similarity matrix as input vectors, and finally used the trained network to obtain SCOP superfamily-level classification. In this experiment, we compared the performance of our method with that of existing techniques on six different datasets. We show that the trained network is able to classify a given similarity matrix of a set of sequences into SCOP superfamilies with high classification accuracy.

8.
With the growing uncertainty and complexity in the manufacturing environment, most scheduling problems have been proven to be NP-complete, which can degrade the performance of conventional operations research (OR) techniques. This article presents a system-attribute-oriented knowledge-based scheduling system (SAOSS) with inductive learning capability. Drawing on the rich heritage of artificial intelligence (AI), SAOSS adopts a multialgorithm paradigm that makes it more intelligent, flexible, and suitable than others for tackling complicated, dynamic scheduling problems. SAOSS employs an efficient and effective inductive learning method, the continuous iterative dichotomiser 3 (CID3) algorithm, to induce decision rules for scheduling by converting the corresponding decision trees into hidden layers of a self-generated neural network. Connection weights between hidden units imply the scheduling heuristics, which are then formulated into scheduling rules. An FMS scheduling problem is also given for illustration. The scheduling results show that the system-attribute-oriented knowledge-based approach is capable of addressing dynamic scheduling problems.

9.
Classification, which is the task of assigning objects to one of several predefined categories, is a pervasive problem that encompasses many diverse applications. The decision tree classifier, a simple yet widely used classification technique, uses training data to yield decision rules; moreover, it can create thresholds and split continuous attributes into discrete intervals (Quinlan in Journal of Artificial Intelligence Research 4:77–90, 1996). Rough set theory (Pawlak in International Journal of Computer and Information Sciences 11:341–356, 1982; International Journal of Man-Machine Studies 20:469–483, 1984; Rough sets: theoretical aspects of reasoning about data. Kluwer, Dordrecht, 1991) has been applied to a wide variety of decision analysis problems for the extraction of rules from databases. This paper proposes a hybrid approach that combines the decision tree and rough set classifiers and applies it to plant classification. The approach starts with a decision tree classifier (C4.5) as a preprocessing technique for interval discretization and subsequently uses a rough set method to extract rules. It aims to find classification rules by analyzing lamina attributes (leaf stalk, leaf width, leaf length, length/width ratio) of Cinnamomum, which were gathered and measured by plant specialists in the field in Taiwan. A comparison with widely used algorithms (e.g., decision tree, multilayer perceptrons, naïve Bayes, and rough sets classifier) is carried out to show the advantages of the proposed approach. Finally, the approach is applied to test data in which the species are unknown, and the classification results are confirmed by consulting the relevant plant specialists.
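The threshold-based discretization of a continuous attribute mentioned above can be sketched as follows. This is a simplified illustration, not the procedure from the paper: it scores candidate cut points by plain information gain and places each cut at the midpoint between adjacent values, whereas C4.5 itself uses further refinements. The attribute values, class labels, and function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def best_threshold(values, labels):
    """Find the cut point on one continuous attribute with the highest
    information gain. Returns (threshold, gain); threshold is None if no
    cut improves on the unsplit data.
    """
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best_gain, best_cut = 0.0, None
    for i in range(1, len(pairs)):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no cut is possible between identical values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        remainder = (len(left) * entropy(left)
                     + len(right) * entropy(right)) / len(pairs)
        gain = base - remainder
        if gain > best_gain:
            best_gain, best_cut = gain, (lo + hi) / 2
    return best_cut, best_gain


# Hypothetical leaf widths (cm) for two placeholder species.
widths = [2.0, 2.4, 2.8, 3.6, 4.0, 4.4]
species = ["species_x", "species_x", "species_x",
           "species_y", "species_y", "species_y"]
cut, gain = best_threshold(widths, species)
print(cut, gain)
```

The resulting threshold turns the continuous attribute into two intervals (below and above the cut), which is the kind of discretized input a rough set rule extractor can then work on.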

10.
Machine learning of functional class from phenotype data   Total citations: 5 (self-citations: 0, citations by others: 5)
MOTIVATION: Mutant phenotype growth experiments are an important novel source of functional genomics data which have received little attention in bioinformatics. We applied supervised machine learning to the problem of using phenotype data to predict the functional class of Open Reading Frames (ORFs) in Saccharomyces cerevisiae. Three sources of data were used: TRansposon-Insertion Phenotypes, Localization and Expression in Saccharomyces (TRIPLES), European Functional Analysis Network (EUROFAN) and Munich Information Center for Protein Sequences (MIPS). The analysis of the data presented a number of challenges to machine learning: multi-class labels, a large number of sparsely populated classes, the need to learn a set of accurate rules (not a complete classification), and a very large number of missing values. We modified the C4.5 algorithm to deal with these problems. RESULTS: Rules were learnt which are accurate and biologically meaningful. The rules predict the function of 83 ORFs of unknown function at an estimated accuracy of ≥ 80%.

11.
Interactive semisupervised learning for microarray analysis   Total citations: 3 (self-citations: 0, citations by others: 3)
Microarray technology has generated vast amounts of gene expression data with distinct patterns. Based on the premise that genes of correlated functions tend to exhibit similar expression patterns, various machine learning methods have been applied to capture these specific patterns in microarray data. However, the discrepancy between the rich expression profiles and the limited knowledge of gene functions has been a major hurdle to the understanding of cellular networks. To bridge this gap and properly comprehend and interpret expression data, we introduce relevance feedback to microarray analysis and propose an interactive learning framework that incorporates expert knowledge into the decision module. To find a good learning method and address two intrinsic problems in microarray data, high dimensionality and small sample size, we also propose a semisupervised learning algorithm: kernel discriminant-EM (KDEM). This algorithm efficiently uses a large set of unlabeled data to compensate for the insufficiency of a small set of labeled data, and it extends the linear algorithm in discriminant-EM (DEM) to a kernel algorithm to handle nonlinearly separable data in a lower dimensional space. Together, the relevance feedback technique and KDEM form an efficient and effective interactive semisupervised learning framework for microarray analysis. Extensive experiments on the yeast cell cycle regulation data set and the Plasmodium falciparum red blood cell cycle data set show the promise of this approach.

12.
Considering the two-class classification problem in brain imaging data analysis, we propose a sparse representation-based multi-variate pattern analysis (MVPA) algorithm to localize brain activation patterns corresponding to different stimulus classes/brain states respectively. Feature selection can be modeled as a sparse representation (or sparse regression) problem. This technique has been successfully applied to voxel selection in fMRI data analysis. However, a single selection based on sparse representation or other methods tends to find only a subset of the most informative features rather than all of them. Our proposed algorithm therefore recursively eliminates informative features selected by a sparse regression method until the decoding accuracy based on the remaining features drops to a threshold close to chance level. In this way, the resultant feature set, which includes all the identified features, is expected to contain all the informative features for discrimination. According to the signs of the sparse regression weights, these selected features are separated into two sets corresponding to two stimulus classes/brain states. Next, to remove irrelevant/noisy features from the two selected feature sets, we perform a nonparametric permutation test at the individual subject level or the group level. In data analysis, we verified our algorithm with a toy data set and an intrinsic signal optical imaging data set. The results show that our algorithm accurately localized two class-related patterns. As an application example, we used our algorithm on a functional magnetic resonance imaging (fMRI) data set. Two sets of informative voxels, corresponding to two semantic categories (i.e., “old people” and “young people”), respectively, were obtained in the human brain.

13.
Wearable sensors have potential for quantitative, gait-based, point-of-care fall risk assessment that can be easily and quickly implemented in clinical-care and older-adult living environments. This investigation generated models for wearable-sensor based fall-risk classification in older adults and identified the optimal sensor type, location, combination, and modelling method for walking with and without a cognitive load task. A convenience sample of 100 older individuals (75.5 ± 6.7 years; 76 non-fallers, 24 fallers based on 6 month retrospective fall occurrence) walked 7.62 m under single-task and dual-task conditions while wearing pressure-sensing insoles and tri-axial accelerometers at the head, pelvis, and left and right shanks. Participants also completed the Activities-specific Balance Confidence scale, Community Health Activities Model Program for Seniors questionnaire, six minute walk test, and ranked their fear of falling. Fall risk classification models were assessed for all sensor combinations and three model types: multi-layer perceptron neural network, naïve Bayesian, and support vector machine. The best performing model was a multi-layer perceptron neural network with input parameters from pressure-sensing insoles and head, pelvis, and left shank accelerometers (accuracy = 84%, F1 score = 0.600, MCC score = 0.521). Head sensor-based models had the best performance of the single-sensor models for single-task gait assessment. Single-task gait assessment models outperformed models based on dual-task walking or clinical assessment data. Support vector machines and neural networks were the best modelling techniques for fall risk classification. Fall risk classification models developed for point-of-care environments should be developed using support vector machines and neural networks, with a multi-sensor single-task gait assessment.
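To make the three reported scores concrete, the sketch below computes accuracy, F1 score, and Matthews correlation coefficient (MCC) from a binary confusion matrix. The counts used are an assumption: they are one confusion matrix that reproduces the reported figures if the best model had been scored on all 100 participants (24 fallers, 76 non-fallers). The abstract does not give the confusion matrix or the evaluation split, so the counts should not be read as the study's data.

```python
from math import sqrt


def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, F1, MCC) for a binary confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    f1 = 2 * tp / (2 * tp + fp + fn)
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return accuracy, f1, mcc


# Assumed counts (fallers are the positive class): 12 of 24 fallers
# detected, 4 of 76 non-fallers flagged incorrectly.
acc, f1, mcc = classification_metrics(tp=12, fp=4, fn=12, tn=72)
print(f"accuracy={acc:.2f}  F1={f1:.3f}  MCC={mcc:.3f}")
```

With imbalanced classes such as these, accuracy alone is flattering: under the assumed counts only half of the fallers are detected, which is why the F1 and MCC values are much lower than the 84% accuracy.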

14.
In a variety of applications, inertial sensors are used to estimate spatial parameters by double integrating their coordinate acceleration components over time. In human movement applications, the drift inherent in the accelerometer signals is often reduced by exploiting the cyclical nature of gait and by assuming that the velocity of the sensor is zero at some point in stance. In this study, the validity of the latter hypothesis was investigated by determining the minimum velocity of progression of selected points of the foot and shank during the stance phase of the gait cycle while walking at three different speeds on level ground. The errors in stride length estimation that result from assuming a zero velocity at the beginning of the integration interval were evaluated on twenty healthy subjects. Results showed that the minimum velocity of the selected points on the foot and shank increased as gait speed increased. Whereas the average minimum velocity of the foot locations was lower than 0.011 m/s, the velocity of the shank locations was up to 0.049 m/s, corresponding to a stride length error of 3.3%. The preferable foot locations for an inertial sensor were found to be the calcaneus and the lateral aspect of the rearfoot. In estimating the stride length, the hypothesis that the velocity of the sensor can be set to zero at some point during stance is acceptable only if the sensor is attached to the foot.
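The effect of the zero-velocity assumption can be illustrated with a short sketch of double integration. It assumes a one-dimensional, drift-free acceleration signal and trapezoidal integration; the acceleration profile, sampling interval, and one-second stride duration are hypothetical, and only the 0.049 m/s residual velocity is taken from the abstract. Because integration is linear, an unaccounted initial velocity v0 shifts the estimated displacement by v0 multiplied by the integration time.

```python
def integrate(samples, dt, initial=0.0):
    """Cumulative trapezoidal integral of evenly spaced samples."""
    out = [initial]
    for prev, cur in zip(samples, samples[1:]):
        out.append(out[-1] + 0.5 * (prev + cur) * dt)
    return out


def stride_length(accel, dt, v0=0.0):
    """Displacement over one stride by double integration of acceleration,
    starting from initial velocity v0."""
    velocity = integrate(accel, dt, initial=v0)
    position = integrate(velocity, dt)
    return position[-1]


dt = 0.01                       # hypothetical 100 Hz sampling
n_samples = 101                 # one hypothetical stride lasting 1.0 s
duration = (n_samples - 1) * dt
# Hypothetical profile: accelerate for half the stride, then decelerate.
accel = [4.0 if i < 50 else -4.0 for i in range(n_samples)]

v0_true = 0.049                 # residual shank velocity reported above (m/s)
actual = stride_length(accel, dt, v0=v0_true)
assumed = stride_length(accel, dt, v0=0.0)   # zero-velocity assumption
error = actual - assumed
print(f"displacement error = {error:.3f} m over {duration:.1f} s")
```

The error grows with the length of the integration interval, which is why the choice of sensor location (and hence the size of the residual velocity) matters for stride length estimation.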

15.
This paper investigated the application of a machine learning approach (support vector machine, SVM) to the automatic recognition of gait changes due to ageing using three types of gait measures: basic temporal/spatial, kinetic, and kinematic. The gaits of 12 young and 12 elderly participants were recorded and analysed using a synchronized PEAK motion analysis system and a force platform during normal walking. Altogether, 24 gait features describing the three types of gait characteristics were extracted for developing gait recognition models and later testing generalization performance. Test results indicated an overall accuracy of 91.7% for the SVM in distinguishing the two gait patterns. The classification ability of the SVM was found to be unaffected across six kernel functions (linear, polynomial, radial basis, exponential radial basis, multi-layer perceptron, and spline). The gait recognition rate improved when features were selected from different gait data types. A feature selection algorithm demonstrated that as few as three gait features, one selected from each data type, could effectively distinguish the age groups with 100% accuracy. These results demonstrate considerable potential for applying SVMs in gait classification for many applications.

16.
17.
The process of knowledge discovery from big and high dimensional datasets has become a popular research topic. The classification problem is a key task in bioinformatics, business intelligence, decision science, astronomy, physics, etc. Building associative classifiers has been a notable research interest in recent years because of their superior accuracy. In associative classifiers, using under-sampling or over-sampling methods for imbalanced big datasets reduces accuracy or increases running time, respectively. Hence, there is a significant need to create efficient associative classifiers for imbalanced big data problems. These classifiers should be able to handle challenges such as memory usage, running time, and efficient exploration of the search space. To this end, efficient calculation of measures is a primary objective for associative classifiers. In this paper, we propose a new efficient associative classifier for big imbalanced datasets. The proposed method is based on Rare-PEARs (a multi-objective evolutionary algorithm that efficiently discovers rare and reliable association rules) and is able to evaluate rules in a distributed manner by using a new data storage format. This format simplifies the calculation of measures and is fully compatible with the MapReduce programming model. We have applied the proposed method (RPII) to a well-known big dataset (ECBDL’14) and have compared our results with seven other learning methods. The experimental results show that RPII outperforms the other methods in the sensitivity and final score measures (approximately 0.74 and 0.54, respectively). The results demonstrate that the proposed method is a good candidate for large-scale classification problems; furthermore, it achieves reasonable execution time when the target platform is a typical computer cluster.

18.
Kudo Y  Okada Y 《Bioinformation》2011,6(5):200-203
We apply a combined method of heuristic attribute reduction and evaluation of relative reducts in rough set theory to gene expression data analysis. Our method extracts as many relative reducts as possible from the gene-expression data and selects the best relative reduct from the viewpoint of constructing useful decision rules. Using a breast cancer dataset and a leukemia dataset, we evaluated the classification accuracy for the test samples and the biological meaning of the rules. Our method achieved high classification accuracy, comparable to existing salient classifiers. Moreover, it extracted interesting rules, including one involving a novel biomarker gene identified in recent studies. These results indicate that our method can serve as a useful tool for gene expression data analysis.

19.
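Comparison of object-oriented remote sensing image classification of forest areas based on different decision trees   Total citations: 1 (self-citations: 0, citations by others: 1)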
陈丽萍  孙玉军 《生态学杂志》2018,29(12):3995-4003
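Geographic object-based image analysis (GEOBIA) is a product of ever-increasing image resolution. How to improve the classification accuracy and efficiency of high-resolution imagery is an important issue in image processing. This study classified the objects obtained by multi-scale segmentation of a QuickBird image, analyzed the efficiency of the C5.0, C4.5, and CART decision tree algorithms for object-based classification in a forest area, and compared their classification accuracy with that of the kNN algorithm. The remote sensing image was segmented at multiple scales with the eCognition software, and the optimal scales were found to be 90 and 40. After vegetation and non-vegetation were separated at scale 90, a total of 21 spectral, texture, shape, and other features of the different vegetation classes were extracted at scale 40, and the C5.0, C4.5, and CART decision tree algorithms were each used to mine these features and automatically build classification rules. Finally, the resulting rules were used to classify the vegetation area, and their accuracies were compared. The results showed that the classification accuracy of every decision tree method was higher than that of the traditional kNN method. The C5.0 method had the highest accuracy, with an overall classification accuracy of 90.0% and a Kappa coefficient of 0.87. Decision tree algorithms can effectively improve the accuracy of tree species classification in forest areas, and the boosting algorithm of the C5.0 decision tree gave the most pronounced improvement in classification performance.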
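The two accuracy measures reported above, overall classification accuracy and the Kappa coefficient, can both be computed from a confusion matrix. The sketch below assumes the standard Cohen's kappa definition; the three-class matrix is hypothetical and is not the study's data.

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion
    matrix whose rows are reference classes and columns are predictions.
    """
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical confusion matrix for three vegetation classes.
confusion = [
    [45, 3, 2],
    [4, 40, 6],
    [1, 5, 44],
]
accuracy, kappa = overall_accuracy_and_kappa(confusion)
print(f"overall accuracy={accuracy:.2f}  kappa={kappa:.2f}")
```

Kappa is lower than overall accuracy because it discounts the agreement that would be expected by chance from the class proportions alone.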

20.
MOTIVATION: The Direct Repeat (DR) locus of Mycobacterium tuberculosis is a suitable model to study (i) molecular epidemiology and (ii) the evolutionary genetics of tuberculosis. This is achieved by a DNA analysis technique (genotyping) called spacer oligonucleotide typing (spoligotyping). In this paper, we investigated data analysis methods, not previously applied to this representation, to discover intelligible knowledge rules from spoligotyping. This processing was achieved by applying the C4.5 induction algorithm, and knowledge rules were produced. Finally, a Prototype Selection (PS) procedure was applied to eliminate noisy data. This both simplified the decision rules and reduced the number of spacers to be tested to solve classification tasks. In the second part of this paper, the contribution of 25 new additional spacers and the knowledge rules inferred were studied from a machine learning point of view. From a statistical point of view, the correlations between spacers were analyzed and suggested that both negative and positive ones may be related to potential structural constraints within the DR locus that may shape its evolution directly or indirectly. RESULTS: By generating knowledge rules induced from decision trees, it was shown that expert knowledge may not only be modeled but also improved and simplified to solve automatic classification tasks on unknown patterns. A practical consequence of this study may be a simplification of the spoligotyping technique, resulting in a reduction of the experimental constraints and an increase in the number of samples processed.
