Similar literature
 18 similar documents found (search time: 62 ms)
1.
刘玉杰  刘毅慧 《生物信息学》2011,9(3):255-258,262
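Feature extraction and classification are key problems in pattern recognition. Combining wavelet analysis with support vector machine (SVM) theory, a classifier model is constructed that divides prostate cancer microarray data into two classes, cancer and normal. Low-frequency wavelet coefficients are extracted to represent the original data and are fed into an SVM classifier. Experiments show that the low-frequency coefficients from a 4-level db1 wavelet decomposition give a correct classification rate of 93.53%, while the Haar wavelet gives 92.94%. Low-frequency coefficients extracted with different wavelets therefore give similar classification results.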
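
The low-frequency feature extraction described in this abstract can be illustrated with a minimal sketch. It assumes a plain Haar averaging filter and a padding rule for odd lengths (repeat the last sample); the function name `haar_approximation` and the toy profile are hypothetical, and this is not the authors' implementation.

```python
import math

def haar_approximation(signal, levels):
    """Return the low-frequency (approximation) coefficients after `levels` Haar steps."""
    coeffs = list(signal)
    for _ in range(levels):
        if len(coeffs) < 2:
            break  # nothing left to decompose
        if len(coeffs) % 2:
            # Assumption: pad odd-length input by repeating the last sample.
            coeffs.append(coeffs[-1])
        # Each Haar step replaces a pair of samples by their scaled sum.
        coeffs = [(coeffs[i] + coeffs[i + 1]) / math.sqrt(2)
                  for i in range(0, len(coeffs), 2)]
    return coeffs

# Hypothetical 16-point profile; 4 levels reduce it to a single coefficient.
profile = [1.0] * 16
features = haar_approximation(profile, 4)
```

Each level halves the number of coefficients, so a 4-level decomposition shrinks the feature vector by a factor of 16 before it is passed to a classifier.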

2.
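In the analysis of high-dimensional protein spectrum data, feature extraction has long been a problem that many researchers have focused on. This paper proposes a feature extraction method based on high-frequency wavelet coefficients and principal component analysis (PCA): wavelet analysis is first used to denoise the data and the high-frequency coefficients are extracted as features, and PCA is then used to reduce the dimensionality. Experiments show that the proposed method achieves recognition rates of 100% and 99.45% on the 8-7-02 and 4/3/02 data sets, respectively, and can effectively improve the classification recognition rate.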

3.
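A tumor recognition model for colon cancer gene expression data is built on wavelet denoising and support vector machines. The experimental data are decomposed with wavelets, and the average classification accuracy of the test samples, computed by cross-validation, is used to determine the wavelet function and the number of decomposition levels. An energy threshold method is introduced to threshold the wavelet decomposition coefficients and thereby denoise the data. A method combining gene classification contribution rate with principal component analysis is proposed to extract features from the colon cancer sample data. The strong nonlinear mapping ability of support vector machines is used to classify the colon cancer samples nonlinearly. To reduce the influence of how the sample set is partitioned on classification accuracy, the Jackknife test is used to evaluate the support vector classifier, giving a classification accuracy of 96.77%. The experimental results demonstrate the effectiveness of the method, which offers a useful reference for colon cancer recognition.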
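
The Jackknife test mentioned in this abstract holds out each sample once and trains on the rest, so the accuracy does not depend on one particular train/test split. The sketch below shows only that evaluation loop. As a labeled assumption, it uses a nearest-centroid classifier as a stand-in for the SVM, and the function names and toy data are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Stand-in classifier (not an SVM): pick the class whose mean vector is closest."""
    sums, counts = {}, {}
    for x, y in zip(train_x, train_y):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[y] = counts.get(y, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        centroid = [v / counts[label] for v in acc]
        dist = sum((a - b) ** 2 for a, b in zip(sample, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def jackknife_accuracy(xs, ys, predict):
    """Hold out each sample once, train on the rest, return the fraction predicted correctly."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)

# Hypothetical toy data: two well-separated classes in two dimensions.
xs = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
ys = ["normal", "normal", "normal", "tumor", "tumor", "tumor"]
accuracy = jackknife_accuracy(xs, ys, nearest_centroid_predict)
```

Any classifier with the same `predict(train_x, train_y, sample)` shape could be passed in place of the stand-in.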

4.
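Objective: To study the recognition of multiple combined finger movements using surface electromyography (sEMG) signals. Methods: After denoising preprocessing of the four-channel sEMG recordings, a moving-window method is used to extract the active signal segments associated with finger movement; the statistical features of the wavelet coefficients of each active segment are then analyzed, and a multi-class support vector machine (SVM) classification algorithm is used to recognize the combined finger movements. Results: The movement recognition rate reaches up to 100%. Conclusion: The method effectively recognizes a variety of hand gestures and lays a theoretical foundation for subsequent research on real-time human-machine interface systems based on EMG signals.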

5.
李博  李强 《生物磁学》2011,(20):3942-3945
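Objective: To study the recognition of multiple combined finger movements using surface electromyography (sEMG) signals. Methods: After denoising preprocessing of the four-channel sEMG recordings, a moving-window method is used to extract the active signal segments associated with finger movement; the statistical features of the wavelet coefficients of each active segment are then analyzed, and a multi-class support vector machine (SVM) classification algorithm is used to recognize the combined finger movements. Results: The movement recognition rate reaches up to 100%. Conclusion: The method effectively recognizes a variety of hand gestures and lays a theoretical foundation for subsequent research on real-time human-machine interface systems based on EMG signals.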
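
The moving-window extraction of active segments can be sketched as follows. The abstract does not say which activity measure or threshold was used, so the windowed mean absolute amplitude, the threshold value, and the function name `active_segments` are all assumptions made for illustration.

```python
def active_segments(signal, window, threshold):
    """Return (start, end) index pairs, end exclusive, covering runs of windows
    whose mean absolute amplitude exceeds `threshold`."""
    segments = []
    start = None
    for i in range(len(signal) - window + 1):
        level = sum(abs(v) for v in signal[i:i + window]) / window
        if level > threshold and start is None:
            start = i  # first active window opens a segment
        elif level <= threshold and start is not None:
            # The last active window began at i - 1 and ends at i - 1 + window.
            segments.append((start, i - 1 + window))
            start = None
    if start is not None:
        segments.append((start, len(signal)))  # signal ended while still active
    return segments

# Hypothetical single-channel recording: rest, a burst of activity, rest.
channel = [0, 0, 0, 0, 1, -1, 1, -1, 0, 0, 0, 0]
segments = active_segments(channel, window=2, threshold=0.6)
```

Here `channel[4:8]` is the burst, and the slices returned this way are what a later feature-extraction step would operate on.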

6.
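When ranking candidate genes, support vector data description (SVDD) can be used to describe various heterogeneous data sources, such as sequence data, academic literature data, and data from various biological experiments. Because biological experimental data are noisy, describing them with SVDD is affected by the noise. This study extends the original SVDD through formula derivation and proposes uncertain support vector data description (USVDD) to reduce the influence of noise. Experiments on yeast gene expression data show that the method describes noisy data better than standard SVDD.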

7.
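Because gene expression data have many attributes and few samples, the Fisher classifier does not perform well on such data. This paper proposes Fisher-List, an improved Fisher algorithm. Its distinctive feature is that it determines a decision threshold for each class, and each threshold contains both information about the overall sample and information about certain individual samples that are critical for classification. Experiments show that the new algorithm performs better on gene expression data classification than Fisher, LogitBoost, AdaBoost, k-nearest neighbors, decision trees, and support vector machines.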

8.
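Accurate subtype classification of acute myeloid leukemia (AML) patients helps in designing an appropriate treatment plan and predicting its outcome. Previous studies have shown that microarray technology achieves good results in leukemia subtype classification, but because the incidence of pediatric AML is low, there are few corresponding microarray studies. As a result, data for building pediatric AML subtype classification models are relatively scarce, and whether existing adult classification model data can be used to predict pediatric AML remains to be studied. Using an integrative microarray analysis approach, microarray data from different experiments studying adult or pediatric AML subtype classification were integrated, and a support vector machine was used to assess the subtype prediction accuracy on the integrated data set. The results show that the integrated microarray data reach an accuracy of 97.24% in pediatric AML subtype prediction. Analysis of the signature genes also shows that, within the same AML subtype, the signature genes of samples from different age groups have highly similar expression.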

9.
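Objective: To study the statistical power of the mixed effects model in the analysis of tumor expression-profile microarray data and to examine its analytical performance. Methods: A mixed effects model was used to analyze real tumor microarray data, with gene set enrichment analysis (GSEA) as the reference method for comparing the validity and soundness of the results. Results: When the two methods were compared on the tumor microarray data, the mixed effects model identified, in addition to the differentially expressed pathways found by both methods, 8 further differentially expressed pathways that GSEA did not detect, and these are supported by the literature. Of the top 10 differentially expressed pathways identified by the mixed effects model, 6 have biological evidence, whereas only 4 of the top 10 identified by GSEA do. Conclusion: As a typical representative of top-down methods, the mixed effects model has the advantage of reducing dimensionality by constructing latent variables, which effectively reduces multiple complex sources of variation and thus ensures accurate and sound results. Its statistical power is better than that of GSEA, and it is an effective method for analyzing tumor microarray data.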

10.
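Based on empirical mode decomposition (EMD) theory, a method for analyzing left- and right-hand motor imagery EEG signals is proposed. The EEG data are first divided using time windows, and each segment is decomposed by EMD into a set of intrinsic mode functions (IMFs); the IMF layers containing the main signal are retained to remove noise. The Hilbert transform is applied to those IMF layers to obtain the instantaneous frequency and the corresponding instantaneous amplitude. The energy of the mu and beta rhythms, the specific frequency bands for left- and right-hand imagery, is then extracted as features, and classification with a support vector machine (SVM) and with Fisher are compared. EMD and wavelet packets are also compared for denoising and feature extraction. The results show that EMD is a very effective denoising method, and that the energy features extracted after EMD decomposition are better at distinguishing left- from right-hand imagery and give a high recognition rate.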

11.
A random forest method is used to perform both gene selection and classification of microarray data. In this embedded method, selecting the smallest possible sets of genes with the lowest error rates is the key factor in achieving the highest classification accuracy. Hence, an improved gene selection method using random forest is proposed to obtain the smallest subset of genes as well as the biggest subset of genes prior to classification. The option of selecting the biggest subset is provided to assist researchers who intend to use the informative genes for further research. The enhanced random forest gene selection performed better at selecting both the smallest and the biggest subsets of informative genes with the lowest out-of-bag error rates. Furthermore, classification on the selected subset of genes using random forest led to lower prediction error rates than the existing method and other similar available methods.

12.
Feature extraction is one of the most important and effective methods for reducing dimensionality in data mining, especially with the emergence of high-dimensional data such as microarray gene expression data. Feature extraction for gene selection mainly serves two purposes. One is to identify certain disease-related genes. The other is to find a compact set of discriminative genes to build a pattern classifier with reduced complexity and improved generalization capabilities. Depending on the purpose of gene selection, two types of feature extraction algorithms, ranking-based and set-based, are employed in microarray gene expression data analysis. In ranking-based feature extraction, features are generally evaluated individually, without considering the inter-relationships between features, while set-based feature extraction evaluates features by their role in a feature set, taking into account the dependency between features. Like learning methods, feature extraction has a generalization problem, namely robustness. However, the issue of robustness is often overlooked in feature extraction. To improve the accuracy and robustness of feature extraction for microarray data, a novel approach based on multi-algorithm fusion is proposed. By fusing different types of feature extraction algorithms to select features from the sample set, the proposed approach is able to improve feature extraction performance. The new approach is tested on gene expression data sets including Colon cancer data, CNS data, DLBCL data, and Leukemia data. The test results show that this algorithm performs better than existing solutions.

13.
Computational analysis is essential for transforming the masses of microarray data into a mechanistic understanding of cancer. Here we present a method for finding gene functional modules of cancer from microarray data and apply it to colon cancer. First, a colon cancer gene network and a normal colon tissue gene network were constructed using correlations between the genes. Then the modules that tended to have a homogeneous functional composition were identified by splitting up the network. Analysis of both networks revealed that they are scale-free. Comparison of the gene functional modules for colon cancer and normal tissues showed that the modules' functions changed with their structures.
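
Building a gene network from correlations can be sketched as follows. The abstract says only that correlations between genes were used, so the choice of Pearson correlation, the absolute-value cutoff, and the names `pearson` and `correlation_network` are assumptions for illustration; the expression values are made up.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def correlation_network(expression, cutoff):
    """Link two genes when |r| across samples is at least `cutoff` (assumed rule)."""
    genes = sorted(expression)
    edges = []
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            if abs(pearson(expression[g1], expression[g2])) >= cutoff:
                edges.append((g1, g2))
    return edges

# Hypothetical expression profiles over four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.1, 8.0],
    "geneC": [4.0, 1.0, 3.0, 2.0],
}
edges = correlation_network(expression, 0.9)
```

Running the same construction separately on tumor and normal samples gives the two networks whose modules are then compared.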

14.
Pok G  Liu JC  Ryu KH 《Bioinformation》2010,4(8):385-389
The microarray technique has become a standard means of simultaneously examining the expression of all genes measured under different circumstances. As microarray data are typically characterized by high-dimensional features with a small number of samples, feature selection needs to be incorporated to identify a subset of genes that are meaningful for biological interpretation and account for the sample variation. In this article, we present a simple yet effective feature selection framework suitable for two-dimensional microarray data. Our correlation-based, nonparametric approach allows compact representation of class-specific properties with a small number of genes. We evaluated our method using publicly available experimental data and obtained favorable results.

15.
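Random forest-based gene pathway analysis of gastric cancer microarray data   Total citations: 1, self-citations: 0, citations by others: 1
Shifting the research focus from individual genes to gene signaling pathways, a set of gastric cancer microarray data was analyzed by combining random forests with signaling pathways. By studying how genes are situated in pathways and the ability of the genes in a pathway to classify intestinal-type gastric cancer, diffuse-type gastric cancer, and normal tissue samples, the work extends the application of random forests in biology and offers a new approach to gastric cancer research.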

16.
Microarrays have thousands to tens of thousands of gene features, but only a few hundred patient samples are available. The fundamental problem in microarray data analysis is identifying genes whose disruption causes congenital or acquired disease in humans. In this paper, we propose a new evolutionary method that can efficiently select a subset of potentially informative genes for support vector machine (SVM) classifiers. The proposed evolutionary method uses an SVM with a given subset of gene features to evaluate the fitness function, and new subsets of features are selected based on estimates of the generalization error of the SVMs and the frequency of occurrence of the features in the evolutionary approach. Thus, in theory, the selected genes reflect to some extent the generalization performance of SVM classifiers. We compare our proposed method with several existing methods and find that it can obtain better classification accuracy with a smaller number of selected genes.

17.
Microarray data are often extremely asymmetric in dimensionality, with thousands or even tens of thousands of genes but only a few hundred samples or fewer. Such extreme asymmetry between the dimensionality of genes and samples can lead to inaccurate diagnosis of disease in the clinic. It has been shown that selecting a small set of marker genes can lead to improved classification accuracy. In this paper, a simple modified ant colony optimization (ACO) algorithm is proposed to select tumor-related ma...

18.
Gene expression profiles of 14 common tumors and their counterpart normal tissues were analyzed with machine learning methods to address the problem of selecting tumor-specific genes and analyzing their differential expression in tumor tissues. First, a variation of the Relief algorithm, the "RFE_Relief" algorithm, was proposed to learn the relations between genes and tissue types. Then, a support vector machine was employed to find the gene subset with the best classification performance for distinguishing cancerous tissues from their counterparts. After tissue-specific genes were removed, cross-validation experiments were employed to demonstrate the common deregulated expression of the selected genes in tumor tissues. The results indicate the existence of a specific expression fingerprint of these genes that is shared across different tumor tissues, and the hallmarks of the expression patterns of these genes in cancerous tissues are summarized at the end of this paper.
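
For readers unfamiliar with Relief, the sketch below shows the basic two-class weight update only; it is not the "RFE_Relief" variant from the abstract, whose details are not given here. It assumes features scaled to [0, 1], Manhattan distance, and a deterministic pass over every sample instead of random sampling. The function name and toy data are hypothetical.

```python
def relief_weights(xs, ys):
    """Basic Relief: reward features that differ on the nearest miss and agree on the nearest hit."""
    n_features = len(xs[0])
    weights = [0.0] * n_features

    def distance(a, b):
        return sum(abs(p - q) for p, q in zip(a, b))

    for i, (x, y) in enumerate(zip(xs, ys)):
        hits = [xs[k] for k in range(len(xs)) if k != i and ys[k] == y]
        misses = [xs[k] for k in range(len(xs)) if ys[k] != y]
        if not hits or not misses:
            continue  # cannot update without both a hit and a miss
        hit = min(hits, key=lambda other: distance(x, other))
        miss = min(misses, key=lambda other: distance(x, other))
        for j in range(n_features):
            weights[j] += (abs(x[j] - miss[j]) - abs(x[j] - hit[j])) / len(xs)
    return weights

# Hypothetical data: feature 0 separates the classes, feature 1 is noise.
xs = [[0.0, 0.5], [0.1, 0.1], [0.9, 0.4], [1.0, 0.2]]
ys = [0, 0, 1, 1]
weights = relief_weights(xs, ys)
```

Features with larger weights are the ones a selection step would keep, so in this toy case feature 0 ranks above feature 1.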
