Similar Documents
20 similar documents found (search time: 328 ms)
1.
PDZ domains have been identified as part of an array of signaling proteins that are often unrelated, except for the well-conserved structural PDZ domain they contain. These domains have been linked to many disease processes, including common avian influenza as well as very rare conditions such as Fraser and Usher syndromes. Historically, based on the interactions and the nature of the bonds they form, PDZ domains have most often been classified into one of three classes (class I, class II, and others - class III), directly dependent on their binding partner. In this study, we report on three unique feature extraction approaches based on bigram and trigram occurrence and existence rearrangements within the domain's primary amino acid sequence for assisting PDZ domain classification. Feature extraction methods based on the wavelet packet transform (WPT) and Shannon entropy, denoted wavelet entropy (WE), were proposed. Using 115 unique human and mouse PDZ domains, the existence rearrangement approach yielded a high recognition rate (78.34%), which outperformed our occurrence rearrangement based method; with a validation technique, the recognition rate was 81.41%. The method reported for PDZ domain classification from primary sequences proved to be an encouraging approach for obtaining consistent classification results. We anticipate that by increasing the database size, we can further improve feature extraction and correct classification.
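The occurrence and existence bigram encodings described above can be sketched as follows; a minimal illustration assuming the standard 20-letter amino-acid alphabet (the function name and 400-dimensional layout are illustrative, not the authors' implementation):

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def bigram_features(sequence, mode="occurrence"):
    """Map a primary amino-acid sequence to a 400-dim bigram vector.

    mode="occurrence": counts of each bigram in the sequence.
    mode="existence":  1 if the bigram appears at least once, else 0.
    """
    bigrams = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    index = {bg: i for i, bg in enumerate(bigrams)}
    vec = [0] * len(bigrams)
    for a, b in zip(sequence, sequence[1:]):
        pair = a + b
        if pair in index:  # skip non-standard residues
            if mode == "existence":
                vec[index[pair]] = 1
            else:
                vec[index[pair]] += 1
    return vec
```

Trigram features follow the same pattern with `repeat=3` and a sliding window of three residues.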

2.
In recent years, many studies have used speech-related features for speech emotion recognition; however, recent work shows a strong correlation between emotional states and glottal features. In this work, Mel-frequency cepstral coefficients (MFCCs), linear predictive cepstral coefficients (LPCCs), perceptual linear predictive (PLP) features, gammatone filter outputs, timbral texture features, stationary wavelet transform based timbral texture features, and relative wavelet packet energy and entropy features were extracted from emotional speech (ES) signals and their glottal waveforms (GW). Particle swarm optimization based clustering (PSOC) and wrapper based particle swarm optimization (WPSO) were proposed to enhance the discerning ability of the features and to select the discriminating features, respectively. Three different emotional speech databases were used to evaluate the proposed method. An extreme learning machine (ELM) was employed to classify the different types of emotions. Several experiments were conducted, and the results show that the proposed method significantly improves speech emotion recognition performance compared to previous works published in the literature.

3.
This paper proposes a new method for feature extraction and recognition of epileptiform activity in EEG signals. The method improves the feature extraction speed for epileptiform activity without reducing the recognition rate. First, principal component analysis (PCA) is applied to the original EEG for dimension reduction and for decorrelation of epileptic and normal EEG. Then the discrete wavelet transform (DWT) combined with approximate entropy (ApEn) is applied to epileptic and normal EEG, respectively. Finally, Neyman–Pearson criteria are applied to separate epileptic EEG from normal EEG. The main procedure is that the principal component of the EEG after PCA is decomposed into several sub-band signals using the DWT, and the ApEn algorithm is applied to the sub-band signals at different wavelet scales. A distinct difference is found between the ApEn values of epileptic and normal EEG. The method allows recognition of epileptiform activities and discriminates them from normal EEG. The algorithm performs well at epileptiform activity recognition on clinical EEG data and offers a flexible tool that is intended to be generalized to the simultaneous recognition of many waveforms in the EEG.
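The approximate entropy (ApEn) statistic applied to the sub-band signals can be computed with the classical Pincus formulation; a pure-Python sketch (the tolerance r is taken as an absolute value here, whereas in practice it is usually set as a fraction of the signal's standard deviation):

```python
import math

def approximate_entropy(signal, m=2, r=0.2):
    """Approximate entropy (ApEn) of a 1-D signal.

    m: embedding dimension; r: similarity tolerance. Low ApEn means a
    regular, predictable signal; higher ApEn means more irregularity.
    """
    def phi(m):
        n = len(signal) - m + 1
        templates = [signal[i:i + m] for i in range(n)]
        counts = []
        for t1 in templates:
            # Chebyshev distance; self-matches are included, so counts > 0
            c = sum(
                1 for t2 in templates
                if max(abs(a - b) for a, b in zip(t1, t2)) <= r
            )
            counts.append(c / n)
        return sum(math.log(c) for c in counts) / n
    return phi(m) - phi(m + 1)
```

A perfectly regular (constant) signal yields ApEn of zero, while irregular signals yield positive values, which is what separates epileptic from normal sub-bands in the method above.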

4.
We propose an unsupervised recognition system for single-trial classification of motor imagery (MI) electroencephalogram (EEG) data. Competitive Hopfield neural network (CHNN) clustering is used to discriminate left- and right-hand MI EEG data after selecting the active segment and extracting multi-scale fractal features. First, we use the continuous wavelet transform (CWT) and Student's two-sample t-statistics to select the active segment in the time-frequency domain. Multiresolution fractal features are then extracted from the wavelet data by means of a modified fractal dimension. Finally, CHNN clustering is adopted to recognize the extracted features. Being unsupervised, CHNN is well suited to classifying non-stationary EEG signals. The results indicate that CHNN achieves an average classification accuracy of 81.9%, compared with a self-organizing map (SOM) and several popular supervised classifiers, on six subjects from two data sets.

5.
Sounds in our environment like voices, animal calls or musical instruments are easily recognized by human listeners. Understanding the key features underlying this robust sound recognition is an important question in auditory science. Here, we studied the recognition by human listeners of new classes of sounds: acoustic and auditory sketches, sounds that are severely impoverished but still recognizable. Starting from a time-frequency representation, a sketch is obtained by keeping only sparse elements of the original signal, here, by means of a simple peak-picking algorithm. Two time-frequency representations were compared: a biologically grounded one, the auditory spectrogram, which simulates peripheral auditory filtering, and a simple acoustic spectrogram, based on a Fourier transform. Three degrees of sparsity were also investigated. Listeners were asked to recognize the category to which a sketch sound belongs: singing voices, bird calls, musical instruments, and vehicle engine noises. Results showed that, with the exception of voice sounds, very sparse representations of sounds (10 features, or energy peaks, per second) could be recognized above chance. No clear differences could be observed between the acoustic and the auditory sketches. For the voice sounds, however, a completely different pattern of results emerged, with at-chance or even below-chance recognition performances, suggesting that the important features of the voice, whatever they are, were removed by the sketch process. Overall, these perceptual results were well correlated with a model of auditory distances, based on spectro-temporal excitation patterns (STEPs). This study confirms the potential of these new classes of sounds, acoustic and auditory sketches, to study sound recognition.
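One simplified reading of the peak-picking sparsification is to keep only the k strongest bins per spectrogram frame and zero everything else; an illustrative sketch (the paper fixes a number of peaks per second rather than per frame, so this is not the authors' exact algorithm):

```python
def sketch(spectrogram, peaks_per_frame=3):
    """Sparsify a time-frequency representation by keeping only the
    strongest peaks in each frame.

    spectrogram: list of frames, each a list of bin magnitudes; every
    value except the top-k per frame is zeroed, yielding a 'sketch'.
    """
    out = []
    for frame in spectrogram:
        order = sorted(range(len(frame)), key=lambda i: frame[i],
                       reverse=True)
        keep = set(order[:peaks_per_frame])
        out.append([v if i in keep else 0.0 for i, v in enumerate(frame)])
    return out
```

The degree of sparsity is then controlled by how many peaks survive per unit time, matching the "10 features per second" condition in the study.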

6.
《IRBM》2020,41(1):58-70
Objectives: The objective of this paper is to present a reliable and accurate technique for myocardial infarction (MI) detection and localization. Material and methods: The stationary wavelet transform was used to decompose the ECG signal. Energy, entropy, and slope based features were extracted at specific wavelet bands from selected ECG leads. k-nearest neighbors (kNN) with the Mahalanobis distance function was used for classification. Sensitivity (Se), specificity (Sp), positive predictivity (+P), accuracy (Acc), and area under the receiver operating characteristic curve (AUC), analyzed over 200 subjects (52 healthy controls, 148 with MI) from the Physikalisch-Technische Bundesanstalt (PTB) database, were used for performance analysis. To handle the imbalanced data, the adaptive synthetic (ADASYN) sampling approach was adopted. Results: For detection of MI, the proposed technique achieved AUC = 0.99, Se = 98.62%, Sp = 99.40%, +P = 99.41%, and Acc = 99.00% using the 12 top-ranked features extracted from multiple ECG leads, and AUC = 0.99, Se = 98.34%, Sp = 99.77%, +P = 99.77%, and Acc = 99.05% using 12 features extracted from a single ECG lead (lead V5). For localization of MI, the proposed technique achieved AUC = 0.99, Se = 98.78%, Sp = 99.86%, +P = 98.80%, and Acc = 99.76% using the 5 top-ranked features from multiple ECG leads, and AUC = 0.98, Se = 96.47%, Sp = 99.60%, +P = 96.49%, and Acc = 99.28% using 8 features extracted from a single ECG lead (lead V3). Conclusion: For MI detection and localization, the proposed technique is thus independent of time-domain ECG fiducial markers and can work using specific ECG leads.
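The kNN-with-Mahalanobis-distance classifier used above can be sketched as follows; a hypothetical minimal implementation (the paper's feature ranking and ADASYN resampling are omitted, and the covariance is estimated from the training set):

```python
import numpy as np

def knn_mahalanobis(X_train, y_train, x, k=3):
    """k-nearest-neighbour vote using the Mahalanobis distance.

    The inverse covariance is estimated from the training set; pinv
    guards against a singular covariance matrix.
    """
    X = np.asarray(X_train, dtype=float)
    x = np.asarray(x, dtype=float)
    vi = np.linalg.pinv(np.atleast_2d(np.cov(X, rowvar=False)))
    # Squared Mahalanobis distance to every training sample
    dists = [float((xi - x) @ vi @ (xi - x)) for xi in X]
    nearest = np.argsort(dists)[:k]
    labels = [y_train[i] for i in nearest]
    return max(set(labels), key=labels.count)  # majority vote
```

Unlike the Euclidean distance, the Mahalanobis distance accounts for the scale and correlation of the wavelet-band features, which is presumably why it was preferred here.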

7.
Two chemometric methods, WPT-ERNN and least squares support vector machines (LS-SVM), were developed to perform the simultaneous spectrophotometric determination of nitrophenol-type compounds with overlapping spectra. The WPT-ERNN method is based on Elman recurrent neural network (ERNN) regression combined with wavelet packet transform (WPT) preprocessing, and relies on combining WPT denoising with ERNN calibration to enhance noise removal and the quality of regression without prior separation. The LS-SVM technique is capable of learning a high-dimensional feature space with fewer training data, and it reduces computational complexity by requiring the solution of only a set of linear equations instead of a quadratic programming problem. The relative standard errors of prediction (RSEPs) obtained for all components using WPT-ERNN, ERNN, LS-SVM, partial least squares (PLS), and multivariate linear regression (MLR) were compared. Experimental results showed that the WPT-ERNN and LS-SVM methods were successful for the simultaneous determination of nitrophenol-type compounds even when severe spectral overlap was present.
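The relative standard error of prediction (RSEP) used to compare the calibration models is commonly defined as below; a sketch assuming this standard definition, since the abstract does not spell out the formula:

```python
import math

def rsep(y_true, y_pred):
    """Relative standard error of prediction, in percent:
    RSEP = 100 * sqrt( sum((y_hat - y)^2) / sum(y^2) )."""
    num = sum((p - t) ** 2 for t, p in zip(y_true, y_pred))
    den = sum(t ** 2 for t in y_true)
    return 100.0 * math.sqrt(num / den)
```

Lower RSEP indicates a better calibration; comparing RSEPs across WPT-ERNN, ERNN, LS-SVM, PLS, and MLR on the same test set is what ranks the methods here.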

8.
M Latinus, P Belin 《PLoS One》2012,7(7):e41384
Humans can identify individuals from their voice, suggesting the existence of a perceptual representation of voice identity. We used perceptual aftereffects - shifts in perceived stimulus quality after brief exposure to a repeated adaptor stimulus - to further investigate the representation of voice identity in two experiments. Healthy adult listeners were familiarized with several voices until they reached a recognition criterion. They were then tested on identification tasks that used vowel stimuli generated by morphing between the different identities, presented either in isolation (baseline) or following short exposure to different types of voice adaptors (adaptation). Experiment 1 showed that adaptation to a given voice induced categorization shifts away from that adaptor's identity, even when the adaptors consisted of vowels different from the probe stimuli. Moreover, original voices and caricatures resulted in comparable aftereffects, ruling out an explanation of identity aftereffects in terms of adaptation to low-level features. Experiment 2 showed that adaptors with a disrupted configuration, i.e., altered fundamental frequency or formant frequencies, failed to produce perceptual aftereffects, demonstrating the importance of the preserved configuration of these acoustical cues in the representation of voices. These two experiments indicate a high-level, dynamic representation of voice identity based on the combination of several lower-level acoustical features into a specific voice configuration.

9.
An image analysis method called two-dimensional wavelet packet analysis (2D WPA) is introduced to quantify branching complexity of neurons. Both binary silhouettes and contour profiles of neurons were analyzed to determine accuracy and precision of the fractal dimension in cell classification tasks. Two-dimensional WPA plotted the slope of decay for a sorted list of discrete wavelet packet coefficients belonging to the adapted wavelet best basis to obtain the fractal dimension for test images and binary representations of neurons. Two-dimensional WPA was compared with box counting and mass-radius algorithms. The results for 2D WPA showed that it could differentiate between neural branching complexity in cells of different type in agreement with accepted methods. The importance of the 2D WPA method is that it performs multiresolution decomposition in the horizontal, vertical, and diagonal orientations.
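For comparison, the box-counting estimate of the fractal dimension that 2D WPA is benchmarked against can be sketched as follows; a minimal version operating on the occupied pixel coordinates of a binary silhouette:

```python
import math

def box_counting_dimension(points, sizes=(1, 2, 4, 8)):
    """Estimate the box-counting fractal dimension of a set of (x, y)
    pixel coordinates from a binary image.

    Counts occupied boxes N(s) at each box size s and fits the slope
    of log N(s) versus log(1/s) by least squares.
    """
    xs, ys = [], []
    for s in sizes:
        boxes = {(x // s, y // s) for x, y in points}
        xs.append(math.log(1.0 / s))
        ys.append(math.log(len(boxes)))
    # Least-squares slope of ys against xs
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den
```

A filled square yields a dimension of 2 and a straight line yields 1; branching neuron silhouettes fall in between, which is what makes the estimate useful for quantifying complexity.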

10.
As important members of the ecosystem, birds are good monitors of the ecological environment. Bird recognition, especially birdsong recognition, has attracted increasing attention in the field of artificial intelligence. At present, both traditional machine learning and deep learning are widely used in birdsong recognition. Deep learning can not only classify and recognize birdsong spectrograms but also serve as a feature extractor, while machine learning is often used to classify handcrafted birdsong feature parameters. As the data samples of the classifier, the birdsong features directly determine classifier performance. Multi-view features obtained from different feature extraction methods capture more complete information about birdsong. Therefore, aiming to enrich the representational capacity of single features and to find a better way to combine features, this paper proposes a birdsong classification model based on multi-view features, which combines deep features extracted by a convolutional neural network (CNN) with handcrafted features. First, four kinds of handcrafted features are extracted: the wavelet transform (WT) spectrum, Hilbert-Huang transform (HHT) spectrum, short-time Fourier transform (STFT) spectrum, and Mel-frequency cepstral coefficients (MFCCs). Then a CNN is used to extract deep features from the WT, HHT, and STFT spectra, and minimal-redundancy-maximal-relevance (mRMR) is used to select optimal features. Finally, three classification models (random forest, support vector machine, and multi-layer perceptron) are built on the deep and handcrafted features, and the class probabilities produced from the two types of features are fused as new features to recognize birdsong.
Taking sixteen species of birds as research objects, the experimental results show that the three classifiers achieve accuracies of 95.49%, 96.25%, and 96.16%, respectively, with the proposed features, which are better than the seven single features and three fused features involved in the experiment. The proposed method effectively combines deep and handcrafted features from the signal perspective. The fused features express the information in the bird audio more comprehensively, have higher classification accuracy and lower dimensionality, and can effectively improve the performance of bird audio classification.
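The fusion step can be illustrated by averaging the per-class probabilities from the two classifiers; note this is a simplified sketch, since the paper feeds the fused probabilities back in as new features rather than taking a direct arg-max:

```python
def fuse_probabilities(prob_deep, prob_handcrafted, weight=0.5):
    """Late fusion of two classifiers' per-class probability vectors.

    Returns the arg-max class of the weighted average, plus the fused
    vector itself (which could instead be used as a new feature).
    """
    fused = [weight * p + (1 - weight) * q
             for p, q in zip(prob_deep, prob_handcrafted)]
    return fused.index(max(fused)), fused
```

The weight parameter is a hypothetical knob for trading off trust between the deep-feature and handcrafted-feature classifiers; the abstract does not state how the two probability streams are weighted.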

11.
Recognition of personally familiar voices benefits from the concurrent presentation of the corresponding speakers’ faces. This effect of audiovisual integration is most pronounced for voices combined with dynamic articulating faces. However, it is unclear if learning unfamiliar voices also benefits from audiovisual face-voice integration or, alternatively, is hampered by attentional capture of faces, i.e., “face-overshadowing”. In six study-test cycles we compared the recognition of newly-learned voices following unimodal voice learning vs. bimodal face-voice learning with either static (Exp. 1) or dynamic articulating faces (Exp. 2). Voice recognition accuracies significantly increased for bimodal learning across study-test cycles while remaining stable for unimodal learning, as reflected in numerical costs of bimodal relative to unimodal voice learning in the first two study-test cycles and benefits in the last two cycles. This was independent of whether faces were static images (Exp. 1) or dynamic videos (Exp. 2). In both experiments, slower reaction times to voices previously studied with faces compared to voices only may result from visual search for faces during memory retrieval. A general decrease of reaction times across study-test cycles suggests facilitated recognition with more speaker repetitions. Overall, our data suggest two simultaneous and opposing mechanisms during bimodal face-voice learning: while attentional capture of faces may initially impede voice learning, audiovisual integration may facilitate it thereafter.

12.
In this paper, EEG signals of 20 schizophrenic patients and 20 age-matched control participants are analyzed with the objective of determining the more informative channels and, finally, distinguishing the two groups. For each participant, 22 channels of EEG were recorded. A two-stage feature selection algorithm is designed such that the more informative channels are first selected to enhance the discriminative information. Two methods, bidirectional search and plus-L minus-R (LRS) techniques, are employed to select these informative channels. Interestingly, most of the selected channels are located in the temporal lobes (containing the limbic system), which confirms the neuropsychological differences in these areas between schizophrenic and normal participants. After channel selection, a genetic algorithm (GA) is employed to select the best features from the selected channels. In this way, in addition to eliminating the less informative channels, the redundant and less discriminant features are also eliminated. A computationally fast algorithm with excellent classification results is obtained. This efficient approach involves several features, including autoregressive (AR) model parameters, band power, fractal dimension, and wavelet energy. To test the performance of the final subset of features, classifiers including linear discriminant analysis (LDA) and the support vector machine (SVM) are employed to classify the reduced feature set of the two groups. Using bidirectional search for channel selection, classification accuracies of 84.62% and 99.38% are obtained for LDA and SVM, respectively. Using the LRS technique for channel selection, classification accuracies of 88.23% and 99.54% are obtained for LDA and SVM, respectively.
Finally, the results are compared and contrasted with two well-known methods, namely single-stage (evolutionary) feature selection and principal component analysis (PCA) based feature selection. The results show improved classification accuracy at relatively low computational time with the two-stage feature selection.

13.
The electroencephalogram (EEG) is considered the fundamental signal for assessing neural activity in the brain. In the cognitive neuroscience domain, EEG-based assessment is attractive due to its non-invasive nature and high temporal resolution. Especially for studying the neurodynamic behavior of epileptic seizures, EEG recordings reflect the neuronal activity of the brain and thus provide the clinical diagnostic information the neurologist requires. This study uses wavelet packet based log and norm entropies with a recurrent Elman neural network (REN) for the automated detection of epileptic seizures. Three conditions, normal, pre-ictal, and epileptic EEG recordings, were considered. An adaptive Wiener filter was first applied to remove the 50 Hz power line noise from the raw EEG recordings. The raw EEGs were segmented into 1 s patterns to ensure stationarity of the signal. A five-level wavelet packet decomposition using the Haar wavelet was then applied, the log and norm entropies were estimated, and these were fed to the REN classifier to perform binary classification. The non-parametric Wilcoxon statistical test was applied to observe the variation in the features under these conditions. The effect of log energy entropy (without wavelets) was also studied. The simulation results show that wavelet packet log entropy with the REN classifier yielded classification accuracies of 99.70% for normal vs. pre-ictal, 99.70% for normal vs. epileptic, and 99.85% for pre-ictal vs. epileptic.
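The wavelet packet decomposition with log and norm entropies can be sketched with a hand-rolled Haar transform; an illustrative sketch (log energy entropy conventionally treats log(0) terms as zero, and the norm entropy exponent p is a free parameter):

```python
import math

def haar_step(signal):
    """One level of the Haar transform: approximation and detail.
    Assumes an even-length input; a trailing odd sample is dropped."""
    a = [(signal[i] + signal[i + 1]) / math.sqrt(2)
         for i in range(0, len(signal) - 1, 2)]
    d = [(signal[i] - signal[i + 1]) / math.sqrt(2)
         for i in range(0, len(signal) - 1, 2)]
    return a, d

def wavelet_packet(signal, level):
    """Full wavelet packet decomposition: at each level, split every
    node (not just the approximation) into approximation + detail."""
    nodes = [signal]
    for _ in range(level):
        nxt = []
        for node in nodes:
            a, d = haar_step(node)
            nxt.extend([a, d])
        nodes = nxt
    return nodes  # 2**level sub-band coefficient lists

def log_energy_entropy(coeffs):
    """Sum of log(c^2); zero coefficients contribute nothing."""
    return sum(math.log(c * c) for c in coeffs if c != 0.0)

def norm_entropy(coeffs, p=1.1):
    """Sum of |c|^p for 1 <= p < 2."""
    return sum(abs(c) ** p for c in coeffs)
```

One entropy value per sub-band node then forms the feature vector fed to the REN classifier.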

14.
Classification of Motor Imagery Tasks Based on Wavelet Packet Entropy
A method for classifying left- and right-hand motor imagery tasks using wavelet packet entropy as the EEG feature vector is proposed, and the dynamic changes of the EEG wavelet packet entropy while subjects imagine left- or right-hand movement, as well as the choice of analysis window length, are studied. The results show that wavelet packet entropy reflects the EEG feature changes of left- and right-hand motor imagery well; using a linear discriminant algorithm to recognize the EEG features, a classification accuracy of 92.14% is achieved. Since wavelet packet entropy is simple to compute, stable, and gives a high recognition rate, it provides a new approach to classifying motor imagery tasks of the brain.

15.
《IRBM》2020,41(3):161-171
Background: The voice is a prominent tool allowing people to communicate and to exchange information in their daily activities. However, any slight alteration in the voice production system may affect voice quality. Over recent years, researchers in the biomedical engineering field have worked to develop robust automatic systems that may help clinicians perform a preventive diagnosis in order to detect voice pathologies at an early stage. Method: In this context, a pathological voice detection and classification method based on EMD-DWT analysis and higher order statistics (HOS) features is proposed; DWT coefficient features are also extracted and tested. To carry out our experiments, a wide subset of voice signals from normal subjects and subjects suffering from the five most frequent pathologies in the Saarbrücken Voice Database (SVD) was selected. In the first step, we applied empirical mode decomposition (EMD) to the voice signal. Afterwards, among the obtained intrinsic mode function (IMF) candidates, we chose the most robust one based on a temporal energy criterion. In the second step, the selected IMF was decomposed via the discrete wavelet transform (DWT). As a result, a feature vector of six HOS parameters and a feature vector of six DWT features were formed from both approximation and detail coefficients. A support vector machine (SVM) was employed to classify the obtained data. After training the proposed system on the SVD database, the system was evaluated using voice signals of volunteer subjects from the neurological department of the RABTA Hospital of Tunis. Results: The proposed method gives promising results in pathological voice detection. The accuracies reached 99.26% using HOS features and 93.1% using DWT features on the SVD database. In classification, an accuracy of 100% was reached for “Funktionelle Dysphonie vs. Rekurrensparese” based on HOS features, whereas using DWT features the accuracy achieved was 90.32% for “Hyperfunktionelle Dysphonie vs. Rekurrensparese”. Furthermore, in the validation, the accuracies reached were 94.82% and 91.37% for HOS and DWT features, respectively. In classification, the highest accuracies reached were 94.44% and 88.87% for classifying “Parkinson vs. Paralysis” based on HOS and DWT features, respectively. Conclusion: HOS features show promising results in automatic voice pathology detection and classification compared to DWT features, and can reliably be used as a noninvasive tool to assist clinical evaluation in identifying pathological voices.
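The HOS feature extraction can be illustrated with the moment-based definitions of variance, skewness, and kurtosis; a sketch of three common higher-order statistics (the paper's exact six-parameter HOS vector is not specified in the abstract):

```python
def hos_features(x):
    """Higher-order statistics of a coefficient vector: variance,
    skewness, and kurtosis, computed from central moments."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n  # variance
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5   # 0 for symmetric data
    kurt = m4 / m2 ** 2     # 3 for a Gaussian
    return m2, skew, kurt
```

Applied separately to the approximation and detail coefficients of the selected IMF, such moments capture the asymmetry and tail behavior that distinguish pathological from normal voices.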

16.
For humans, voice pitch is highly flexible and, when lowered, makes male speakers sound more dominant, intimidating, threatening, and likely to aggress. Importantly, pitch lowering could not have evolved as a threat signal with these effects on signal receivers unless it were honest on average. Drawing on Enquist's retaliation-cost model, we tested the hypothesis that heterosexual men high in threat potential will show enhanced memory for low-pitched male voices when mating motives were activated. Supporting this hypothesis, we found that heterosexual Chinese males higher in trait aggressiveness (Experiment 1) and heterosexual U.S. males higher in upper-body strength (Experiment 2) were more accurate in distinguishing between previously heard and unheard low- but not high-pitched male voices under a mating-motive prime. We believe that this enhanced recognition accuracy for low-pitched male voices facilitates retaliation for men with high threat potential, and thereby serves to probe the honesty of pitch lowering as an aggressive signal.

17.
Accurate localization of the QRS complex is the basis of automatic ECG signal analysis. To improve the QRS detection rate, an algorithm based on independent component analysis (ICA) and combined wavelet entropy (CWS) is proposed for detecting the QRS complex in multi-lead ECG signals. The ICA algorithm separates the independent components corresponding to ventricular activity from the filtered multi-lead ECG signals; the continuous wavelet transform (CWT) is then applied to each independent component, the phase space of the wavelet coefficients is reconstructed, and the independent components are sorted according to the QRS information in the phase space. Finally, the combined wavelet entropy of the sorted independent components is examined to obtain the QRS information. Experiments on the St. Petersburg 12-lead arrhythmia database and a 64-lead canine epicardial database compare the performance of the proposed algorithm with single-lead and two-lead QRS detection algorithms. The results show that the proposed algorithm performs best, with detection accuracies of 99.98% and 100%, respectively.

18.
To address the problems of single feature extraction and low classification accuracy in current multi-class motor imagery EEG recognition, a four-class motor imagery EEG recognition method based on multi-feature fusion is proposed to improve the recognition rate. From the preprocessed EEG signals, the Hilbert-Huang transform, one-versus-rest common spatial patterns, approximate entropy, fuzzy entropy, and sample entropy are used to extract an initial feature vector combining time-frequency, spatial, and nonlinear-dynamics information; principal component analysis is then used for dimensionality reduction; and finally a particle swarm optimized support vector machine is used for classification. On the data of subject k3b from the international standard data set BCI2005 Data set IIIa, the algorithm achieves a recognition rate of 93.30% after MATLAB simulation, higher than that obtained with any single feature or other feature combination in the experiment. Motor imagery EEG data were also collected from four experimenters and processed with the proposed method, yielding an average recognition rate of 72.96%. The results show that multi-feature fusion better characterizes motor imagery EEG signals and that the particle swarm optimized support vector machine achieves high recognition accuracy, providing a new recognition method for the study of cognitive brain activity.

19.
This paper describes a preprocessing stage for the nonlinear classifier used in wavelet packet transform (WPT) based multichannel surface electromyogram (EMG) classification. The preprocessing stage, named sdPCA, consists of supervised discretization coupled with principal component analysis (PCA) and was developed to improve the surface EMG classifier's generalization ability and training speed on overlap segmented signals. sdPCA outperforms the fast correlation-based filter (FCBF), PCA, supervised discretization, and their combinations in terms of generalization ability, training speed, feature size, and the ability to reduce the risks of oscillation and of being trapped during nonlinear classifier training. The experiments were conducted on a data set consisting of 4-channel surface EMG signals measured from 6 hand and wrist gestures of 12 subjects. The experimental results indicate that the classification system using sdPCA has the highest generalization ability along with the second fastest training speed. Its classification accuracy across the 12 subjects is 93.30 ± 2.42%, taking 400 epochs for training on overlap segmented signals within 100 s. This result is attractive for further development because high classification accuracy can be achieved for large data sets by means of the proposed sdPCA without additional algorithms such as local discriminant bases (LDB), majority voting (MV), or WPT sub-band clustering.
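The overlap segmentation on which sdPCA operates can be sketched as a sliding window; the window length and step size below are illustrative placeholders, not the paper's settings:

```python
def overlap_segments(signal, window, step):
    """Split a signal into overlapping analysis windows.

    Each segment is `window` samples long and starts `step` samples
    after the previous one; step < window produces overlap.
    """
    return [signal[i:i + window]
            for i in range(0, len(signal) - window + 1, step)]
```

Overlapping windows multiply the number of training examples from a fixed recording, which is the large-data-set setting that motivates sdPCA's emphasis on training speed.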

20.

Background and objective

There has been a growing interest in objective assessment of speech in dysphonic patients for the classification of the type and severity of voice pathologies using automatic speech recognition (ASR). The aim of this work was to study the accuracy of the conventional ASR system (with Mel frequency cepstral coefficients (MFCCs) based front end and hidden Markov model (HMM) based back end) in recognizing the speech characteristics of people with pathological voice.

Materials and methods

The speech samples of 62 dysphonic patients with six different types of voice disorders and 50 normal subjects were analyzed. The Arabic spoken digits were taken as an input. The distribution of the first four formants of the vowel /a/ was extracted to examine deviation of the formants from normal.

Results

Recognition accuracy was 100% for Arabic digits spoken by normal speakers. However, there was a significant loss of accuracy when the digits were spoken by voice-disordered subjects. Moreover, no significant improvement in ASR performance was achieved after assessing a subset of the individuals with disordered voices who underwent treatment.

Conclusion

The results of this study revealed that the current ASR technique is not a reliable tool for recognizing the speech of dysphonic patients.


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  京ICP备09084417号