
1.

Support vector machines for detecting age-related changes in running kinematics

Fukuchi RK Eskofier BM Duarte M Ferber R 《Journal of biomechanics》2011,44(3):540-542

Age-related changes in running kinematics have been reported in the literature using classical inferential statistics. However, this approach has been hampered by the increased number of biomechanical gait variables reported and subsequently the lack of differences presented in these studies. Data mining techniques have been applied in recent biomedical studies to solve this problem using a more general approach. In the present work, we re-analyzed lower extremity running kinematic data of 17 young and 17 elderly male runners using the Support Vector Machine (SVM) classification approach. In total, 31 kinematic variables were extracted to train the classification algorithm and test the generalized performance. The results revealed different accuracy rates across three different kernel methods adopted in the classifier, with the linear kernel performing the best. A subsequent forward feature selection algorithm demonstrated that with only six features, the linear kernel SVM achieved 100% classification performance rate, showing that these features provided powerful combined information to distinguish age groups. The results of the present work demonstrate potential in applying this approach to improve knowledge about the age-related differences in running gait biomechanics and encourage the use of the SVM in other clinical contexts.
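The forward feature selection step mentioned in this abstract can be sketched as a greedy wrapper that adds one feature at a time while cross-validated accuracy keeps improving. This is only an illustration under stated assumptions: a nearest-centroid rule with leave-one-out evaluation is swapped in for the linear kernel SVM so the example needs no external library, and the function names and toy data are hypothetical, not taken from the paper.

```python
def loo_accuracy(X, y, feats):
    """Leave-one-out accuracy of a nearest-centroid rule on the chosen features.

    Assumption: this simple classifier stands in for the linear SVM.
    """
    correct = 0
    for i in range(len(X)):
        # Build per-class sums from every sample except the held-out one.
        sums, counts = {}, {}
        for j, (row, label) in enumerate(zip(X, y)):
            if j == i:
                continue
            acc = sums.setdefault(label, [0.0] * len(feats))
            for k, f in enumerate(feats):
                acc[k] += row[f]
            counts[label] = counts.get(label, 0) + 1

        def dist(label):
            # Squared distance from the held-out sample to a class centroid.
            return sum((X[i][f] - sums[label][k] / counts[label]) ** 2
                       for k, f in enumerate(feats))

        correct += min(sums, key=dist) == y[i]
    return correct / len(X)


def forward_select(X, y, max_feats):
    """Greedily add the feature that most improves leave-one-out accuracy."""
    selected, best = [], 0.0
    while len(selected) < max_feats:
        candidates = [f for f in range(len(X[0])) if f not in selected]
        if not candidates:
            break
        scores = {f: loo_accuracy(X, y, selected + [f]) for f in candidates}
        f_best = max(scores, key=scores.get)
        if scores[f_best] <= best:
            break  # no remaining feature improves accuracy
        selected.append(f_best)
        best = scores[f_best]
    return selected, best


# Hypothetical toy data: feature 0 is noise, feature 1 separates the groups.
X = [[5, 0.1], [1, 0.2], [4, 0.0], [2, 1.0], [5, 0.9], [1, 1.1]]
y = ["young", "young", "young", "elderly", "elderly", "elderly"]
result = forward_select(X, y, max_feats=2)
```

On the toy data the wrapper stops after selecting the single informative feature, which mirrors the idea that a small feature subset can carry most of the discriminative information.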

2.

Occupancy classification of position weight matrix-inferred transcription factor binding sites

Wright H Cohen A Sönmez K Yochum G McWeeney S 《PloS one》2011,6(11):e26160


3.

ECG beat classification using PCA, LDA, ICA and Discrete Wavelet Transform

Roshan Joy Martis U. Rajendra Acharya Lim Choo Min 《Biomedical signal processing and control》2013,8(5):437-448

The electrocardiogram (ECG) is the P-QRS-T wave, representing the cardiac function. The information concealed in the ECG signal is useful in detecting the disease afflicting the heart. It is very difficult to identify the subtle changes in the ECG in time and frequency domains. The Discrete Wavelet Transform (DWT) can provide good time and frequency resolutions and is able to decipher the hidden complexities in the ECG. In this study, five types of beat classes of arrhythmia as recommended by the Association for the Advancement of Medical Instrumentation (AAMI) were analyzed, namely non-ectopic beats, supra-ventricular ectopic beats, ventricular ectopic beats, fusion beats, and unclassifiable and paced beats. Three dimensionality reduction algorithms, Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and Independent Component Analysis (ICA), were independently applied on DWT sub-bands for dimensionality reduction. These dimensionality-reduced features were fed to the Support Vector Machine (SVM), neural network (NN) and probabilistic neural network (PNN) classifiers for automated diagnosis. ICA features in combination with PNN with a spread value (σ) of 0.03 performed better than PCA and LDA. This combination yielded an average sensitivity, specificity, positive predictive value (PPV) and accuracy of 99.97%, 99.83%, 99.21% and 99.28% respectively using a ten-fold cross-validation scheme.
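The four figures of merit quoted in this abstract follow directly from confusion-matrix counts. The sketch below shows the standard definitions for a single class treated as "positive"; the function name and the counts are hypothetical and are not data from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics for one class treated as positive."""
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts for illustration only.
m = binary_metrics(tp=90, fp=5, tn=95, fn=10)
```

For a five-class problem such as the AAMI beat classes, one common convention is to compute these per class (one-versus-rest) and then average; whether the paper does exactly that is not stated in the abstract.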

4.

Predicting protein-protein interactions by combing various sequence-derived features into the general form of Chou's Pseudo amino acid composition

Zhao XW Ma ZQ Yin MH 《Protein and peptide letters》2012,19(5):492-500

Knowledge of protein-protein interactions (PPIs) plays an important role in constructing protein interaction networks and understanding the general machineries of biological systems. In this study, a new method is proposed to predict PPIs using a comprehensive set of 930 features based only on sequence information. These features measure, from different aspects, the interactions between residues a certain distance apart in the protein sequences. To achieve better performance, principal component analysis (PCA) is first employed to obtain an optimized feature subset. Then, the resulting 67-dimensional feature vectors are fed to a Support Vector Machine (SVM). Experimental results on Drosophila melanogaster and Helicobacter pylori datasets show that our method is very promising for predicting PPIs and may at least be a useful supplementary tool to existing methods.

5.

A classification model for predicting the anticancer drug response of individual tumors and its application


LI Shaoda LI Yushuang 《Progress in Biochemistry and Biophysics》2022,49(6):1165-1172

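Objective: Different patients may respond differently to the same anticancer drug, and understanding these differences in response is of great reference value for precision cancer medicine. Methods: High-throughput sequencing data provide strong data support for building classification models that predict anticancer drug response. Using two classic datasets, the Cancer Cell Line Encyclopedia (CCLE) and Genomics of Drug Sensitivity in Cancer (GDSC), this paper proposes mRMR-SVM, a computational model based on the maximum relevance minimum redundancy (mRMR) algorithm and the support vector machine (SVM). From gene expression data, feature genes are extracted by variance ranking and the mRMR algorithm, and the SVM then performs a binary "sensitive-inhibited" classification of how cell lines respond to anticancer drugs. Results: For the 22 drugs in CCLE, mRMR-SVM achieved an average accuracy of 0.904; for the 11 drugs in GDSC, the average accuracy was 0.851. Conclusion: mRMR-SVM not only outperforms the traditional support vector machine, random forest, deep response forest, deep neural network, and cell line-drug complex network models in predictive performance, but also generalizes well, giving satisfactory results for anticancer drug response classification in three specific tissue types. In addition, mRMR-SVM can identify biomarkers closely related to the occurrence and development of cancer.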

6.

Using Support Vector Machine Ensembles for Target Audience Classification on Twitter

Siaw Ling Lo Raymond Chiong David Cornforth 《PloS one》2015,10(4)

The vast amount and diversity of the content shared on social media can pose a challenge for any business wanting to use it to identify potential customers. In this paper, our aim is to investigate the use of both unsupervised and supervised learning methods for target audience classification on Twitter with minimal annotation efforts. Topic domains were automatically discovered from contents shared by followers of an account owner using Twitter Latent Dirichlet Allocation (LDA). A Support Vector Machine (SVM) ensemble was then trained using contents from different account owners of the various topic domains identified by Twitter LDA. Experimental results show that the methods presented are able to successfully identify a target audience with high accuracy. In addition, we show that using a statistical inference approach such as bootstrapping in over-sampling, instead of using random sampling, to construct training datasets can achieve a better classifier in an SVM ensemble. We conclude that such an ensemble system can take advantage of data diversity, which enables real-world applications for differentiating prospective customers from the general audience, leading to business advantage in the crowded social media space.

7.

Combined feature selection and cancer prognosis using support vector machine regression

Sun BY Zhu ZH Li J Linghu B 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2011,8(6):1671-1677

Prognostic prediction is important in the medical domain, because it can be used to select an appropriate treatment for a patient by predicting the patient's clinical outcomes. For high-dimensional data, a normal prognostic method undergoes two steps: feature selection and prognosis analysis. Recently, the L1-L2-norm Support Vector Machine (L1-L2 SVM) has been developed as an effective classification technique and shown good classification performance with automatic feature selection. In this paper, we extend the L1-L2 SVM to regression analysis with automatic feature selection. We further improve the L1-L2 SVM for prognostic prediction by utilizing the information of censored data as constraints. We design an efficient solution to the new optimization problem. The proposed method is compared with seven other prognostic prediction methods on three real-world data sets. The experimental results show that the proposed method performs consistently better than the medium performance. It is more efficient than other algorithms with similar performance.

8.

A Computationally Efficient Correlational Neural Network for Automated Prediction of Chronic Kidney Disease

N. Bhaskar M. Suchetha 《IRBM》2021,42(4):268-276

Objectives: In this paper, we propose a computationally efficient Correlational Neural Network (CorrNN) learning model and an automated diagnosis system for detecting Chronic Kidney Disease (CKD). A Support Vector Machine (SVM) classifier is integrated with the CorrNN model for improving the prediction accuracy. Material and methods: The proposed hybrid model is trained and tested with a novel sensing module. We have monitored the concentration of urea in the saliva sample to detect the disease. Experiments are carried out to test the model with real-time samples and to compare its performance with conventional Convolutional Neural Network (CNN) and other traditional data classification methods. Results: The proposed method outperforms the conventional methods in terms of computational speed and prediction accuracy. The CorrNN-SVM combined network achieved a prediction accuracy of 98.67%. The experimental evaluations show a reduction in overall computation time of about 9.85% compared to the conventional CNN algorithm. Conclusion: The use of the SVM classifier has improved the capability of the network to make predictions more accurately. The proposed framework substantially advances the current methodology, and it provides more precise results compared to other data classification methods.

9.

Classifying DNA repair genes by kernel-based support vector machines

Jiang H Ching WK 《Bioinformation》2011,7(5):257-263


10.

Support Vector Machine Classifier for Estrogen Receptor Positive and Negative Early-Onset Breast Cancer

Rosanna Upstill-Goddard Diana Eccles Sarah Ennis Sajjad Rafiq William Tapper Joerg Fliege Andrew Collins 《PloS one》2013,8(7)

Two major breast cancer sub-types are defined by the expression of estrogen receptors on tumour cells. Cancers with large numbers of receptors are termed estrogen receptor positive and those with few are estrogen receptor negative. Using genome-wide single nucleotide polymorphism genotype data for a sample of early-onset breast cancer patients we developed a Support Vector Machine (SVM) classifier from 200 germline variants associated with estrogen receptor status (p<0.0005). Using a linear kernel Support Vector Machine, we achieved classification accuracy exceeding 93%. The model indicates that polygenic variation in more than 100 genes is likely to underlie the estrogen receptor phenotype in early-onset breast cancer. Functional classification of the genes involved identifies enrichment of functions linked to the immune system, which is consistent with the current understanding of the biological role of estrogen receptors in breast cancer.

11.

Machine Learning for Biomedical Literature Triage

Hayda Almeida Marie-Jean Meurs Leila Kosseim Greg Butler Adrian Tsang 《PloS one》2014,9(12)

This paper presents a machine learning system for supporting the first task of the biological literature manual curation process, called triage. We compare the performance of various classification models, by experimenting with dataset sampling factors and a set of features, as well as three different machine learning algorithms (Naive Bayes, Support Vector Machine and Logistic Model Trees). The results show that the most fitting model to handle the imbalanced datasets of the triage classification task is obtained by using domain relevant features, an under-sampling technique, and the Logistic Model Trees algorithm.

12.

An efficient malware detection approach with feature weighting based on Harris Hawks optimization

Alzubi Omar A. Alzubi Jafar A. Al-Zoubi Ala' M. Hassonah Mohammad A. Kose Utku 《Cluster computing》2022,25(4):2369-2387

This paper introduces and tests a novel machine learning approach to detect Android malware. The proposed approach is composed of Support Vector Machine (SVM) classifier and...

13.

Diagnosis of hepatocellular carcinoma from 31P magnetic resonance spectroscopy based on support vector machines

FU Tingting LIU Yihui LIU Qiang LI Baopeng CHENG Jinyong 《Chinese Journal of Bioinformatics》2010,8(1):20-22

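The support vector machine is a new machine learning method developed on the basis of statistical learning theory and is widely used in pattern recognition. A support vector machine model is applied to 31P magnetic resonance spectroscopy data to classify liver tissue, distinguishing hepatocellular carcinoma, cirrhosis, and normal liver. Support vector machine classifiers based on polynomial and radial basis kernel functions are compared, and recognition rates are obtained for the three liver classes. The experiments show that a support vector machine classification model based on 31P magnetic resonance spectroscopy data can make diagnostic predictions for the liver in vivo.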
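The two kernel families compared in this abstract, polynomial and radial basis function (RBF), have standard textbook forms, sketched below. The parameter names and default values (`degree`, `coef0`, `gamma`) are assumptions chosen for illustration; the abstract does not report the settings used in the study.

```python
import math


def polynomial_kernel(x, z, degree=3, coef0=1.0):
    """Polynomial kernel: (x . z + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, z))
    return (dot + coef0) ** degree


def rbf_kernel(x, z, gamma=0.5):
    """Radial basis function kernel: exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)
```

The polynomial kernel grows with the inner product of the two inputs, while the RBF kernel depends only on their distance and is always between 0 and 1, which is one reason the two can yield different recognition rates on the same spectra.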

14.

Pattern recognition of number gestures based on a wireless surface EMG system

Xun Chen Z. Jane Wang 《Biomedical signal processing and control》2013,8(2):184-192

Using surface electromyography (sEMG) signal for efficient recognition of hand gestures has attracted increasing attention during the last decade, with most previous work being focused on recognition of upper arm and gross hand movements and some work on the classification of individual finger movements such as finger typing tasks. However, relatively few investigations can be found in the literature for automatic classification of multiple finger movements such as finger number gestures. This paper focuses on the recognition of number gestures based on a 4-channel wireless sEMG system. We investigate the effects of three popular feature types (i.e. Hudgins’ time–domain features (TD), autocorrelation and cross-correlation coefficients (ACCC) and spectral power magnitudes (SPM)) and four popular classification algorithms (i.e. k-nearest neighbor (k-NN), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA) and support vector machine (SVM)) in offline recognition. Motivated by the good performance of SVM, we further propose combining the three features and employing a new classification method, multiple kernel learning SVM (MKL-SVM). Real sEMG results from six subjects show that all combinations, except k-NN or LDA using ACCC features, can achieve above 91% average recognition accuracy, and the highest accuracy is 97.93% achieved by the proposed MKL-SVM method using the three feature combination (3F). Referring to the offline recognition results, we also implement a real-time recognition system. Our results show that all six subjects can achieve a real-time recognition accuracy higher than 90%. The number gestures are therefore promising for practical applications such as human–computer interaction (HCI).

15.

A comparison of methods for classifying clinical samples based on proteomics data: a case study for statistical and machine learning approaches

Sampson DL Parker TJ Upton Z Hurst CP 《PloS one》2011,6(9):e24973

The discovery of protein variation is an important strategy in disease diagnosis within the biological sciences. The current benchmark for elucidating information from multiple biological variables is the so-called “omics” disciplines of the biological sciences. Such variability is uncovered by implementation of multivariable data mining techniques which come under two primary categories, machine learning strategies and statistics-based approaches. Typically proteomic studies can produce hundreds or thousands of variables, p, per observation, n, depending on the analytical platform or method employed to generate the data. Many classification methods are limited by an n≪p constraint, and as such, require pre-treatment to reduce the dimensionality prior to classification. Recently machine learning techniques have gained popularity in the field for their ability to successfully classify unknown samples. One limitation of such methods is the lack of a functional model allowing meaningful interpretation of results in terms of the features used for classification. This is a problem that might be solved using a statistical model-based approach where not only is the importance of each individual protein explicit, but the proteins are also combined into a readily interpretable classification rule without relying on a black box approach. Here we incorporate the statistical dimension reduction techniques Partial Least Squares (PLS) and Principal Components Analysis (PCA), followed by both statistical and machine learning classification methods, and compare them to a popular machine learning technique, Support Vector Machines (SVM). Both PLS and SVM demonstrate strong utility for proteomic classification problems.

16.

Automatic detection of epileptic EEG signals using higher order cumulant features

Acharya UR Sree SV Suri JS 《International journal of neural systems》2011,21(5):403-414

The unpredictability of the occurrence of epileptic seizures makes it difficult to detect and treat this condition effectively. An automatic system that characterizes epileptic activities in EEG signals would allow patients or the people near them to take appropriate precautions, would allow clinicians to better manage the condition, and could provide more insight into these phenomena thereby revealing important clinical information. Various methods have been proposed to detect epileptic activity in EEG recordings. Because of the nonlinear and dynamic nature of EEG signals, the use of nonlinear Higher Order Spectra (HOS) features is a seemingly promising approach. This paper presents the methodology employed to extract HOS features (specifically, cumulants) from normal, interictal, and epileptic EEG segments and to use significant features in classifiers for the detection of these three classes. In this work, 300 sets of EEG data belonging to the three classes were used for feature extraction and classifier development and evaluation. The results show that the HOS based measures have unique ranges for the different classes with high confidence level (p-value < 0.0001). On evaluating several classifiers with the significant features, it was observed that the Support Vector Machine (SVM) presented a high detection accuracy of 98.5% thereby establishing the possibility of effective EEG segment classification using the proposed technique.

17.

Prediction of protein solvent accessibility using support vector machines

Yuan Z Burrage K Mattick JS 《Proteins》2002,48(3):566-570

A Support Vector Machine learning system has been trained to predict protein solvent accessibility from the primary structure. Different kernel functions and sliding window sizes have been explored to find how they affect the prediction performance. Using a cut-off threshold of 15% that splits the dataset evenly (an equal number of exposed and buried residues), this method was able to achieve a prediction accuracy of 70.1% for single sequence input and 73.9% for multiple alignment sequence input, respectively. The prediction of three and more states of solvent accessibility was also studied and compared with other methods. The prediction accuracies are better than, or comparable to, those obtained by other methods such as neural networks, Bayesian classification, multiple linear regression, and information theory. In addition, our results further suggest that this system may be combined with other prediction methods to achieve more reliable results, and that the Support Vector Machine method is a very useful tool for biological sequence analysis.
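The two preprocessing ideas named in this abstract, a sliding window over the primary sequence and a 15% cut-off for the two-state label, can be sketched as follows. This is a minimal illustration under assumptions: the padding character `X`, the strict `>` comparison at the cut-off, and both function names are hypothetical choices, not details given in the abstract.

```python
def sliding_windows(sequence, size, pad="X"):
    """Return one window of odd length `size` centred on each residue.

    Assumption: sequence ends are padded with a placeholder character.
    """
    half = size // 2
    padded = pad * half + sequence + pad * half
    return [padded[i:i + size] for i in range(len(sequence))]


def label_residues(relative_accessibility, cutoff=15.0):
    """Label each residue 'e' (exposed) or 'b' (buried) by a percent cut-off."""
    return ["e" if value > cutoff else "b" for value in relative_accessibility]


windows = sliding_windows("ACDE", size=3)
labels = label_residues([3.0, 40.2, 15.0])
```

Each window would then be encoded numerically (for example, one-hot per position) before being passed to the classifier; larger windows give the model more sequence context at the cost of a higher-dimensional input.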

18.

Prediction of plant long non-coding RNAs by multi-feature fusion

YAN Lingjuan CHEN Yingli YAN Dongxue FAN Zhiyu 《Chinese Journal of Bioinformatics》2021,19(2):128-135

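Long non-coding RNAs (lncRNAs) are RNA transcripts longer than 200 nt that lack protein-coding capacity. Studies have shown that lncRNAs play important roles in regulating plant growth and development, epigenetic responses, and various stress responses. Compared with humans and animals, however, research on plant lncRNAs is still in its infancy. How to accurately pick out lncRNAs from a large number of transcripts remains one of the important problems in plant lncRNA research. This paper constructs new plant lncRNA and mRNA datasets, analyzes the sequence and structural characteristics of the plant lncRNAs in them, and extracts features including k-mer frequency information, secondary structure information, open reading frame information, and the geometric flexibility of the sequence. Using the support vector machine (SVM) algorithm with the jackknife test, plant lncRNAs are predicted, and the effect of fusing the various features on the prediction results is computed; the accuracy reaches 96.14%.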
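Of the feature types listed in this abstract, the k-mer information is the simplest to illustrate. The sketch below computes normalized k-mer frequencies over a fixed alphabet; the `ACGU` alphabet, the normalization by the number of valid k-mers, and the function name are assumptions made for the example, since the abstract does not specify them.

```python
from itertools import product


def kmer_frequencies(seq, k=2, alphabet="ACGU"):
    """Return normalized k-mer frequencies in lexicographic k-mer order.

    Windows containing characters outside the alphabet are skipped.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]
```

The resulting vector has a fixed length of `len(alphabet) ** k` regardless of transcript length, so it can be concatenated with other features (structure, open reading frame) to form the fused input.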

19.

Combining selectivity and affinity predictions using an integrated Support Vector Machine (SVM) approach: An alternative tool to discriminate between the human adenosine A2A and A3 receptor pyrazolo-triazolo-pyrimidine antagonists binding sites

Lisa Michielan Chiara Bolcato Stephanie Federico Barbara Cacciari Magdalena Bacilieri Karl-Norbert Klotz Sonja Kachler Giorgia Pastorin Riccardo Cardin Alessandro Sperduti Giampiero Spalluto Stefano Moro 《Bioorganic & medicinal chemistry》2009,17(14):5259-5274


20.

Support vector machines for predicting rRNA-, RNA-, and DNA-binding proteins from amino acid sequence

Cai YD Lin SL 《Biochimica et biophysica acta》2003,1648(1-2):127-133

Classification of gene function remains one of the most important and demanding tasks in the post-genome era. Most of the current predictive computer methods rely on comparing features that are essentially linear to the protein sequence. However, features of a protein nonlinear to the sequence may also be predictive of its function. Machine learning methods, for instance the Support Vector Machines (SVMs), are particularly suitable for exploiting such features. In this work we introduce SVM and the pseudo-amino acid composition, a collection of nonlinear features extractable from protein sequence, to the field of protein function prediction. We have developed prototype SVMs for binary classification of rRNA-, RNA-, and DNA-binding proteins. Using a protein's amino acid composition and limited range correlation of hydrophobicity and solvent accessible surface area as input, each of the SVMs predicts whether the protein belongs to one of the three classes. In self-consistency and cross-validation tests, which measure the success of learning and prediction, respectively, the rRNA-binding SVM has consistently achieved >95% accuracy. The RNA- and DNA-binding SVMs demonstrate more diverse accuracy, ranging from approximately 76% to approximately 97%. Analysis of the test results suggests directions for improving the SVMs.
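The amino acid composition used as part of the SVM input here is the fraction of each of the 20 standard residues in a sequence. The sketch below covers only that plain composition; the correlation terms for hydrophobicity and solvent accessible surface area that make up the rest of the pseudo-amino acid composition are not reproduced, and the function name and residue ordering are assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def amino_acid_composition(seq):
    """Return the fraction of each standard residue, in AMINO_ACIDS order.

    Non-standard characters are ignored when computing the fractions.
    """
    seq = seq.upper()
    n = sum(1 for residue in seq if residue in AMINO_ACIDS)
    return [seq.count(a) / n if n else 0.0 for a in AMINO_ACIDS]


comp = amino_acid_composition("AACD")
```

Because the composition discards residue order, it is a feature that is "nonlinear to the sequence" in the sense used by the abstract: two proteins with very different sequences can share the same 20-dimensional vector.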