Similar literature
1.
2.
《IRBM》2020,41(1):31-38
In this paper, a brain-computer interface (BCI) system for character recognition is proposed based on the P300 signal. A P300 speller is used to spell a word or character without any muscle movement. P300 detection is the first step in detecting the character from the electroencephalogram (EEG) signal; the character is then recognized from the detected P300 signal. Sparse autoencoder (SAE) and stacked sparse autoencoder (SSAE) based feature extraction methods are proposed for P300 detection. This work also proposes a fusion of deep features with temporal features for P300 detection. An SSAE extracts high-level information about the input data, and combining SSAE features with temporal features provides both abstract and temporal information about the signal. An ensemble of weighted artificial neural networks (EWANN) is proposed for P300 detection to minimize the variation among different classifiers. To give more importance to the good classifiers in the final classification, a higher weight is assigned to the better performing classifier. These weights are calculated from the cross-validation test. The model is tested on two publicly available datasets, and the proposed method achieves character recognition performance better than or comparable to state-of-the-art methods.
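The abstract above does not give the exact weighting formula, so the following is only a minimal sketch under an assumption: each network's weight is its cross-validation accuracy divided by the sum of all accuracies. The function names and the example numbers are hypothetical and are not taken from the paper.

```python
def cv_weights(cv_accuracies):
    """Turn per-classifier cross-validation accuracies into weights that sum to 1.

    Assumption: weight is proportional to accuracy; the paper may use another rule.
    """
    total = sum(cv_accuracies)
    if total <= 0:
        raise ValueError("at least one classifier needs a positive accuracy")
    return [acc / total for acc in cv_accuracies]


def weighted_ensemble_score(p300_scores, weights):
    """Combine per-classifier P300 scores (each in [0, 1]) into one weighted score."""
    if len(p300_scores) != len(weights):
        raise ValueError("need one weight per classifier")
    return sum(w * s for w, s in zip(weights, p300_scores))


# Hypothetical numbers: three networks with cross-validation accuracies 0.9, 0.6, 0.5.
weights = cv_weights([0.9, 0.6, 0.5])
# Only the best-performing network votes "P300 present" here.
score = weighted_ensemble_score([1.0, 0.0, 0.0], weights)
```

With these illustrative inputs the better classifier receives the largest weight, so its vote alone contributes most of the ensemble score.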

3.

Background  

SpectraClassifier (SC) is a Java solution for designing and implementing Magnetic Resonance Spectroscopy (MRS)-based classifiers. The main goal of SC is to allow users with minimal background knowledge of multivariate statistics to perform a fully automated pattern recognition analysis. SC incorporates feature selection (a greedy stepwise approach, either forward or backward) and feature extraction (PCA). Fisher Linear Discriminant Analysis is the method of choice for classification. Classifier evaluation is performed through various methods: display of the confusion matrix of the training and testing datasets; K-fold cross-validation, leave-one-out and bootstrapping; and Receiver Operating Characteristic (ROC) curves.

4.
This study investigates the use of saliva as an emerging diagnostic fluid, in conjunction with classification techniques, to discern biological heterogeneity in clinically labelled gingivitis and periodontitis subjects (80 subjects; 40/group). A battery of classification techniques was investigated, both as traditional single classifier systems and within a novel selective voting ensemble classification approach (SVA) framework. Unlike traditional single classifiers, SVA is shown to reveal patient-specific variations within disease groups, which may be important for identifying proclivity to disease progression or disease stability. Salivary expression profiles of IL-1β, IL-6, MMP-8, and MIP-1α from 80 patients were analyzed using four classification algorithms (Linear Discriminant Analysis [LDA], Quadratic Discriminant Analysis [QDA], Naïve Bayes Classifier [NBC] and Support Vector Machines [SVM]) as traditional single classifiers and within the SVA framework (SVA-LDA, SVA-QDA, SVA-NB and SVA-SVM). Our findings demonstrate that the performance measures (sensitivity, specificity and accuracy) of the traditional single classifiers were comparable to those of their SVA counterparts when the clinical labels of the samples were used as ground truth. However, unlike traditional single classifier approaches, the normalized ensemble vote-counts from SVA revealed varying proclivity of the subjects for each of the disease groups. More importantly, the SVA identified a subset of gingivitis and periodontitis samples that demonstrated a biological proclivity commensurate with the other clinical group. This subset was confirmed across SVA-LDA, SVA-QDA, SVA-NB and SVA-SVM. Heatmap visualization of their ensemble sets revealed a lack of consensus between these subsets and the rest of the samples within the respective disease groups, indicating the unique nature of the patients in these subsets.
While the source of variation is not known, the results presented clearly show the need for novel approaches that accommodate inherent heterogeneity and personalized variations within disease groups in diagnostic characterization. The proposed approach falls within the scope of P4 medicine (predictive, preventive, personalized, and participatory), with the ability to identify unique patient profiles that may predict specific disease trajectories and support targeted disease management.

5.
We investigate the multiclass classification of cancer microarray samples. In contrast to the classification of two cancer types from gene expression data, multiclass classification of more than two cancer types is a relatively hard and less studied problem. We used class-wise optimized genes with a corresponding one-versus-all support vector machine (OVA-SVM) classifier to maximize the utilization of the selected genes. The final prediction was made by using probability scores from all classifiers. We used three different methods of estimating probability from the decision value. Among the three probability methods, Platt's approach was more consistent, whereas the isotonic approach performed better for datasets with unequal proportions of samples in different classes. A probability-based decision not only gives a true and fair comparison between different one-versus-all (OVA) classifiers but also makes it possible to use them for any post analysis. Several ensemble experiments on the three probability methods, an example of post analysis, were implemented to study their effect on classification accuracy. We observe that the ensemble did help improve predictive accuracy on cancer datasets, especially those involving unbalanced samples. A four-fold external stratified cross-validation experiment was performed on the six multiclass cancer datasets to obtain unbiased estimates of prediction accuracy. Analysis of class-wise frequently selected genes on two cancer datasets demonstrated that the approach was able to select important and relevant genes consistent with the literature. This study demonstrates successful implementation of the framework of class-wise feature selection and multiclass classification for prediction of cancer subtypes on six datasets.
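As a rough illustration of the probability-based decision described above, the sketch below applies Platt's sigmoid, P = 1 / (1 + exp(A·f + B)), to each one-versus-all decision value f and picks the class with the highest probability. The parameters A and B are assumed to have been fitted elsewhere; the class labels, parameter values, and function names are hypothetical.

```python
import math


def platt_probability(decision_value, a, b):
    """Platt's sigmoid: map an SVM decision value to a probability."""
    return 1.0 / (1.0 + math.exp(a * decision_value + b))


def predict_class(decision_values, platt_params):
    """Pick the class whose one-versus-all classifier gives the highest probability.

    decision_values: {label: decision value of that label's OVA classifier}
    platt_params:    {label: (A, B)} fitted per classifier (assumed given here)
    """
    probs = {
        label: platt_probability(value, *platt_params[label])
        for label, value in decision_values.items()
    }
    best = max(probs, key=probs.get)
    return best, probs


# Hypothetical decision values and parameters for two OVA classifiers.
label, probs = predict_class(
    {"subtype_A": 2.0, "subtype_B": -1.0},
    {"subtype_A": (-1.0, 0.0), "subtype_B": (-1.0, 0.0)},
)
```

Because every classifier's output is mapped to the same probability scale, the scores can be compared directly across classifiers, which is the point the abstract makes about fair comparison.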

6.
7.
Hayat M  Khan A  Yeasin M 《Amino acids》2012,42(6):2447-2460
Knowledge of the types of membrane protein provides useful clues for deducing the functions of uncharacterized membrane proteins. An automatic method for efficiently identifying uncharacterized proteins is thus highly desirable. In this work, we have developed a novel method for predicting membrane protein types by exploiting the discrimination capability of the difference in amino acid composition at the N and C termini through split amino acid composition (SAAC). We also show that ensemble classification can better exploit this discriminating capability of SAAC. In this study, membrane protein types are classified using three feature extraction strategies and several classification strategies. An ensemble classifier, Mem-EnsSAAC, is then developed using the best feature extraction strategy. Pseudo amino acid (PseAA) composition, discrete wavelet analysis (DWT), SAAC, and a hybrid model are employed for feature extraction. The nearest neighbor, probabilistic neural network, support vector machine, random forest, and AdaBoost are used as individual classifiers. The predicted results of the individual learners are combined using a genetic algorithm to form an ensemble classifier, Mem-EnsSAAC, yielding an accuracy of 92.4 and 92.2% for the Jackknife and independent dataset tests, respectively. Performance measures such as MCC, sensitivity, specificity, F-measure, and Q-statistics show that SAAC-based prediction yields significantly higher performance compared to PseAA- and DWT-based systems, and is also the best reported so far. The proposed Mem-EnsSAAC is able to predict membrane protein types with high accuracy and, consequently, can be very helpful in drug discovery. It can be accessed at http://111.68.99.218/membrane.
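To make the idea of split amino acid composition concrete, here is a minimal sketch: the sequence is cut into an N-terminal, an internal, and a C-terminal segment, the 20-residue composition of each segment is computed, and the three vectors are concatenated into 60 features. The default segment length of 25 residues is an assumption for illustration; the abstract does not state the lengths used, and the function names are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(segment):
    """Fraction of each standard amino acid in a segment (all zeros if empty)."""
    n = len(segment)
    return [segment.count(aa) / n if n else 0.0 for aa in AMINO_ACIDS]


def saac(sequence, n_term=25, c_term=25):
    """Split amino acid composition: N-terminal, internal, and C-terminal parts.

    n_term and c_term are illustrative defaults, not values from the paper.
    """
    if len(sequence) < n_term + c_term:
        raise ValueError("sequence is shorter than the two terminal segments")
    cut = len(sequence) - c_term
    head, middle, tail = sequence[:n_term], sequence[n_term:cut], sequence[cut:]
    return composition(head) + composition(middle) + composition(tail)


# Toy sequence: the three segments have visibly different compositions.
features = saac("AAA" + "CCCC" + "DDD", n_term=3, c_term=3)
```

A plain amino acid composition of the toy sequence would blur the three regions together, whereas the split version keeps the terminal signal separate.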

8.
G-protein coupled receptors (GPCRs) are a membrane protein family that serves as an interface between the cell and the outside world. They are involved in various physiological processes and are the targets of more than 50% of marketed drugs. The function of GPCRs can be determined by conducting biological experiments. However, given the rapid increase in GPCR sequences entering databanks, it is very time consuming and expensive to determine their function based only on experimental techniques. Hence, the computational prediction of GPCRs is in high demand for both pharmaceutical and educational research. In the proposed research, feature extraction of GPCRs is performed using three techniques: pseudo amino acid composition, wavelet-based multi-scale energy, and evolutionary-information-based feature extraction using position specific scoring matrices. For classification, a majority voting based ensemble method is used, whose weights are optimized using a genetic algorithm. Four classifiers are used in the ensemble: Nearest Neighbor, Probabilistic Neural Network, Support Vector Machine and Grey Incidence Degree. The performance of the proposed method is assessed using the Jackknife test on a number of datasets. First, the individual performance of each classifier is assessed for each dataset using the Jackknife test. After that, the performance for each dataset is improved by using weighted ensemble classification. The weights of the ensemble are optimized using several runs of the genetic algorithm. We have compared our method with various other methods. The performance of the proposed method indicates that it is useful for GPCR classification.

9.
Real-time plant species recognition in an unconstrained environment is a challenging and time-consuming process. The recognition model should cope with computer vision challenges such as scale variations, illumination changes, camera viewpoint or object orientation changes, cluttered backgrounds and the structure of the leaf (simple or compound). In this paper, a bilateral convolutional neural network (CNN) with machine learning classifiers is investigated for the real-time implementation of plant species recognition. The CNN models considered are MobileNet, Xception and DenseNet-121. In the bilateral CNNs (homogeneous or heterogeneous type), the models are connected using the cascade early fusion strategy. The bilateral CNN is used for feature extraction. The extracted features are then classified using different machine learning classifiers: Linear Discriminant Analysis (LDA), multinomial Logistic Regression (MLR), Naïve Bayes (NB), k-Nearest Neighbor (k-NN), Classification and Regression Tree (CART), Random Forest Classifier (RF), Bagging Classifier (BC), Multi-Layer Perceptron (MLP) and Support Vector Machine (SVM). The experimental investigation shows that the multinomial Logistic Regression classifier performed better than the other classifiers, irrespective of the bilateral CNN model (Homogeneous - MoMoNet, XXNet, DeDeNet; Heterogeneous - MoXNet, XDeNet, MoDeNet). It is also observed that the MoDeNet + MLR model attained state-of-the-art results (Flavia: 98.71%, Folio: 96.38%, Swedish Leaf: 99.41%, custom created Leaf-12: 99.39%), irrespective of the dataset. The number of mispredictions per class is greatly reduced by using the MoDeNet + MLR model for real-time plant species recognition.

10.
Afridi TH  Khan A  Lee YS 《Amino acids》2012,42(4):1443-1454
Mitochondria are all-important organelles of eukaryotic cells, since they are involved in processes associated with cellular mortality and human diseases. Therefore, trustworthy techniques are highly required for the identification of new mitochondrial proteins. We propose the Mito-GSAAC system for the prediction of mitochondrial proteins. The aim of this work is to investigate an effective feature extraction strategy and to develop an ensemble approach that can better exploit the advantages of this feature extraction strategy for mitochondria classification. We investigate four kinds of protein representations for the prediction of mitochondrial proteins: amino acid composition, dipeptide composition, pseudo amino acid composition, and split amino acid composition (SAAC). Individual classifiers such as support vector machine (SVM), k-nearest neighbor, multilayer perceptron, random forest, AdaBoost, and bagging are first trained. An ensemble classifier is then built using genetic programming (GP) to evolve a complex but effective decision space from the individual decision spaces of the trained classifiers. The highest prediction performance for the Jackknife test is 92.62%, using the GP-based ensemble classifier on SAAC features, which is the highest accuracy reported so far on the Mitochondria dataset being used. On the Malaria Parasite Mitochondria dataset, the highest accuracy is obtained by SVM using SAAC, and it is further enhanced to 93.21% using the GP-based ensemble. It is observed that SAAC has better discrimination power for mitochondria prediction than the other feature extraction strategies. Thus, the improved prediction performance is largely due to the better capability of SAAC for discriminating between mitochondrial and non-mitochondrial proteins at the N and C termini and the effective combination capability of GP. Mito-GSAAC can be accessed at .
It is expected that the novel approach and the accompanying predictor will have a major impact on Molecular Cell Biology, Proteomics, Bioinformatics, System Biology, and Drug Development.

11.
Recently, ensemble learning methods have been widely used to improve classification performance in machine learning. In this paper, we present a novel ensemble learning method: argumentation based multi-agent joint learning (AMAJL), which integrates ideas from multi-agent argumentation, ensemble learning, and association rule mining. In AMAJL, argumentation technology is introduced as an ensemble strategy to integrate multiple base classifiers and generate a high performance ensemble classifier. We design an argumentation framework named Arena as a communication platform for knowledge integration. Through argumentation based joint learning, high quality individual knowledge can be extracted, and thus a refined global knowledge base can be generated and used independently for classification. We perform numerous experiments on multiple public datasets using AMAJL and other benchmark methods. The results demonstrate that our method can effectively extract high quality knowledge for the ensemble classifier and improve classification performance.

12.
A P300-based brain-computer interface (BCI) enables a wide range of people to control devices that improve their quality of life. Ensemble classifiers with naive partitioning were recently applied to the P300-based BCI and their classification performance was assessed. However, they were usually trained on a large amount of training data (e.g., 15300). In this study, we evaluated ensemble linear discriminant analysis (LDA) classifiers with a newly proposed overlapped partitioning method using 900 training data. In addition, the classification performances of the ensemble classifier with naive partitioning and a single LDA classifier were compared. One of three conditions for dimension reduction was applied: the stepwise method, principal component analysis (PCA), or none. The results show that an ensemble stepwise LDA (SWLDA) classifier with overlapped partitioning achieved better performance than the commonly used single SWLDA classifier and an ensemble SWLDA classifier with naive partitioning. This result implies that the performance of the SWLDA is improved by overlapped partitioning and that the ensemble classifier with overlapped partitioning requires less training data than that with naive partitioning. This study contributes towards reducing the required amount of training data and achieving better classification performance.

13.
Classification of datasets with imbalanced sample distributions has always been a challenge. In general, a popular approach for enhancing classification performance is the construction of an ensemble of classifiers. However, the performance of an ensemble depends on the choice of constituent base classifiers. Therefore, we propose a genetic algorithm-based search method for finding the optimum combination from a pool of base classifiers to form a heterogeneous ensemble. The algorithm, called GA-EoC, utilises 10-fold cross-validation on training data for evaluating the quality of each candidate ensemble. To combine the base classifiers' decisions into the ensemble’s output, we used the simple and widely used majority voting approach. The proposed algorithm, along with the random sub-sampling approach to balance the class distribution, has been used for classifying class-imbalanced datasets. Additionally, if a feature set was not available, we used the (α, β) − k Feature Set method to select a better subset of features for classification. We have tested GA-EoC with three benchmarking datasets from the UCI-Machine Learning repository, one Alzheimer’s disease dataset and a subset of the PubFig database of Columbia University. In general, the performance of the proposed method on the chosen datasets is robust and better than that of the constituent base classifiers and many other well-known ensembles. Based on our empirical study we claim that a genetic algorithm is a superior and reliable approach to heterogeneous ensemble construction, and we expect that the proposed GA-EoC would perform consistently in other cases.
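The majority voting step mentioned above can be sketched in a few lines. This is a generic illustration rather than the GA-EoC implementation; the function names are hypothetical, and the tie-breaking rule (the label seen first wins) is simply what `collections.Counter` does, not something stated in the abstract.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label among the base classifiers' votes."""
    return Counter(labels).most_common(1)[0][0]


def ensemble_predict(predictions_per_classifier):
    """Combine predictions sample by sample.

    predictions_per_classifier: one list of predicted labels per base
    classifier, all aligned to the same samples.
    """
    return [majority_vote(votes) for votes in zip(*predictions_per_classifier)]


# Three hypothetical base classifiers predicting three samples.
combined = ensemble_predict([
    ["a", "b", "a"],
    ["a", "b", "b"],
    ["b", "b", "a"],
])
```

Using an odd number of base classifiers on a two-class problem avoids ties altogether, which is one reason the choice of ensemble members matters.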

14.
15.
Sleep apnoea is a very common sleep disorder that can cause symptoms such as daytime sleepiness, irritability and poor concentration. This paper presents a combinational feature extraction approach for the detection of sleep apnoea, based on nonlinear features extracted from the electrocardiogram (ECG) Reconstructed Phase Space (RPS) and commonly used frequency domain features. Here, 6 nonlinear features extracted from the ECG RPS are combined with 3 frequency based features to construct the final feature set. The nonlinear features consist of Detrended Fluctuation Analysis (DFA), Correlation Dimensions (CD), 3 Large Lyapunov Exponents (LLEs) and Spectral Entropy (SE). The final proposed feature set shows about 94.8% accuracy on the Physionet sleep apnoea dataset using a kernel based SVM classifier. This research also shows that using nonlinear analysis to detect sleep apnoea can potentially improve the classification accuracy of apnoea detection systems.

16.
Yu K  Ji L 《Cytometry》2002,48(4):202-208
BACKGROUND: Comparative genomic hybridization (CGH) is a relatively new molecular cytogenetic method that detects chromosomal imbalances. Automatic karyotyping is an important step in CGH analysis because the precise position of the chromosome abnormality must be located and manual karyotyping is tedious and time-consuming. In the past, computer-aided karyotyping was done by using the 4',6-diamidino-2-phenylindole, dihydrochloride (DAPI)-inverse images, which required complex image enhancement procedures. METHODS: An innovative method, the kernel nearest-neighbor (K-NN) algorithm, is proposed to accomplish automatic karyotyping. The algorithm is an application of the "kernel approach," which offers an alternative solution to linear learning machines by mapping data into a high dimensional feature space. By implicitly calculating the Euclidean or Mahalanobis distance in a high dimensional image feature space, two kinds of K-NN algorithms are obtained. New feature extraction methods concerning multicolor information in CGH images are used for the first time. RESULTS: Experimental results show that the feature extraction method using multicolor information in CGH images greatly improves the classification success rate. A high success rate of about 91.5% has been achieved, which shows that the K-NN classifier efficiently accomplishes automatic chromosome classification from relatively few samples. CONCLUSIONS: The feature extraction method proposed here and K-NN classifiers offer a promising computerized intelligent system for automatic karyotyping of CGH human chromosomes.
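The "implicit" distance calculation mentioned in the METHODS can be illustrated with the standard kernel trick: the squared Euclidean distance in feature space is K(x, x) − 2K(x, y) + K(y, y), so the mapping itself never has to be computed. The sketch below covers only the Euclidean case with a polynomial kernel; the kernel choice, its parameters, and the function names are assumptions for illustration and are not taken from the paper.

```python
def poly_kernel(x, y, degree=2, c=1.0):
    """Polynomial kernel (x . y + c) ** degree; an illustrative choice."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + c) ** degree


def kernel_distance_sq(x, y, kernel=poly_kernel):
    """Squared Euclidean distance in the kernel-induced feature space."""
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)


def kernel_nearest_neighbor(query, training_set, kernel=poly_kernel):
    """Return the label of the training sample closest to the query.

    training_set: list of (feature_vector, label) pairs.
    """
    _, label = min(
        training_set,
        key=lambda pair: kernel_distance_sq(query, pair[0], kernel),
    )
    return label


# Toy data: two labelled points and one query.
train = [((0.0, 0.0), "class_1"), ((5.0, 5.0), "class_2")]
nearest = kernel_nearest_neighbor((1.0, 1.0), train)


# Sanity check: with a linear kernel the formula is the ordinary squared distance.
def linear(x, y):
    return sum(a * b for a, b in zip(x, y))


plain = kernel_distance_sq((1.0, 2.0), (3.0, 4.0), kernel=linear)
```

With the linear kernel the result reduces to the familiar squared distance in input space, which is a quick way to check the formula before swapping in a nonlinear kernel.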

17.
To achieve high assessment accuracy for credit risk, a novel multistage deep belief network (DBN) based extreme learning machine (ELM) ensemble learning methodology is proposed. The proposed methodology involves three main stages: training subset generation, individual classifier training, and final ensemble output. In the first stage, a bagging sampling algorithm is applied to generate different training subsets to guarantee enough training data. Second, the ELM, an effective AI forecasting tool with the merits of short training time and high accuracy, is used as the individual classifier, and diverse ensemble members are formulated with different subsets and different initial conditions. In the final stage, the individual results are fused into the final classification output via a DBN model with sufficient hidden layers, which can effectively capture the valuable information hidden in the ensemble members. For illustration and verification, an experimental study on one publicly available credit risk dataset is conducted, and the results show the superiority of the proposed multistage DBN-based ELM ensemble learning paradigm in terms of classification accuracy.
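The first stage described above, generating training subsets by bagging, amounts to sampling with replacement. The following is a minimal sketch under the usual bagging assumption that each subset has the same size as the original data; the function name, the seed parameter, and the toy data are hypothetical.

```python
import random


def bagging_subsets(samples, n_subsets, seed=0):
    """Draw n_subsets bootstrap samples (with replacement), each of full size."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(samples)
    return [
        [samples[rng.randrange(n)] for _ in range(n)]
        for _ in range(n_subsets)
    ]


# Ten hypothetical training records, five bootstrap subsets.
data = list(range(10))
subsets = bagging_subsets(data, n_subsets=5)
```

Each subset would then train one ensemble member; the differences between subsets are what make the members diverse.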

18.
19.
As important members of the ecosystem, birds are good monitors of the ecological environment. Bird recognition, especially birdsong recognition, has attracted more and more attention in the field of artificial intelligence. At present, traditional machine learning and deep learning are widely used in birdsong recognition. Deep learning can not only classify and recognize birdsong spectra, but can also be used as a feature extractor. Machine learning is often used to classify and recognize extracted handcrafted birdsong feature parameters. As the data samples of the classifier, the features of birdsong directly determine the performance of the classifier. Multi-view features from different feature extraction methods can capture more complete information about birdsong. Therefore, aiming to enrich the representational capacity of a single feature and to find a better way to combine features, this paper proposes a birdsong classification model based on multi-view features, which combines deep features extracted by a convolutional neural network (CNN) with handcrafted features. Firstly, four kinds of handcrafted features are extracted: the wavelet transform (WT) spectrum, the Hilbert-Huang transform (HHT) spectrum, the short-time Fourier transform (STFT) spectrum and Mel-frequency cepstral coefficients (MFCC). Then a CNN is used to extract deep features from the WT, HHT and STFT spectra, and minimal-redundancy-maximal-relevance (mRMR) is used to select optimal features. Finally, three classification models (random forest, support vector machine and multi-layer perceptron) are built with the deep features and handcrafted features, and the classification probabilities of the two types of features are fused as new features to recognize birdsong.
Taking sixteen species of birds as research objects, the experimental results show that the three classifiers obtain accuracies of 95.49%, 96.25% and 96.16% respectively for the features of the proposed method, which are better than those of the seven single features and three fused features involved in the experiment. The proposed method effectively combines deep features and handcrafted features from the perspective of the signal. The fused features express the information of the bird audio more comprehensively, with higher classification accuracy and lower dimension, which can effectively improve the performance of bird audio classification.

20.
《Genomics》2020,112(2):1282-1289
DNase I hypersensitive sites (DHSs) are related to DNA regulatory elements, so the understanding of DHSs is of great significance for biomedical research. However, traditional experiments are not very good at identifying recombinant sites of a large number of emerging DNA sequences by sequencing. Some machine learning methods have been proposed to identify DHSs, but most methods ignore the spatial autocorrelation of the DNA sequence. In this paper, we propose a predictor called iDHS-DSAMS to identify DHSs based on the benchmark datasets. We develop a feature extraction method called dinucleotide-based spatial autocorrelation (DSA). We then use Min-Redundancy-Max-Relevance (mRMR) to remove irrelevant and redundant features, and a 100-dimensional feature vector is selected. Finally, we use an ensemble bagged tree as the classifier, trained on datasets oversampled using SMOTE. Five-fold cross-validation tests on two benchmark datasets indicate that the proposed method outperforms its existing counterparts on accuracy (Acc), Matthews correlation coefficient (MCC), sensitivity (Sn) and specificity (Sp).
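The four measures named above have standard definitions in terms of a binary confusion matrix, and the sketch below computes them from true/false positive and negative counts. The function name and the example counts are hypothetical and are not results from the paper.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, accuracy and MCC from confusion-matrix counts.

    Assumes at least one positive and one negative sample, so the
    sensitivity and specificity denominators are non-zero.
    """
    sn = tp / (tp + fn)                      # sensitivity (recall on positives)
    sp = tn / (tn + fp)                      # specificity (recall on negatives)
    acc = (tp + tn) / (tp + tn + fp + fn)    # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}


# Hypothetical counts from one cross-validation fold.
metrics = binary_metrics(tp=40, tn=45, fp=5, fn=10)
```

Reporting MCC alongside accuracy is useful on imbalanced data, since accuracy alone can look high when one class dominates.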
