Similar articles
17 similar articles found (search time: 93 ms)
1.
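刘玉杰, 刘毅慧. 《生物信息学》, 2011, 9(3): 255-258, 262
Feature extraction and classification are key problems in pattern recognition. Combining wavelet analysis with support vector machine theory, a classifier model is constructed that divides prostate cancer gene chip data into two classes, cancer and normal. Low-frequency wavelet coefficients are extracted to represent the original data and fed into a support vector machine classifier. Experiments show that with the low-frequency coefficients from a 4-level db1 wavelet decomposition, the correct classification rate reaches 93.53%; with the Haar wavelet it is 92.94%. Extracting the low-frequency coefficients of different wavelets therefore gives similar classification performance.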
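As an illustration of the "low-frequency coefficients" step in this abstract, here is a minimal pure-Python sketch of a multi-level Haar decomposition that keeps only the approximation coefficients. The function name, the padding rule for odd lengths, and the toy 16-value profile are assumptions made for the example; the abstract does not describe the authors' implementation, and the SVM stage is omitted.

```python
import math

def haar_approximation(signal, levels):
    """Return the low-frequency (approximation) coefficients
    after `levels` steps of the orthonormal Haar transform."""
    approx = list(signal)
    for _ in range(levels):
        if len(approx) < 2:
            break  # nothing left to decompose
        if len(approx) % 2:
            # Assumption: pad odd-length input by repeating the last sample.
            approx.append(approx[-1])
        # Each output is the scaled sum of a neighbouring pair.
        approx = [(approx[i] + approx[i + 1]) / math.sqrt(2)
                  for i in range(0, len(approx), 2)]
    return approx

# Hypothetical 16-probe expression profile; 4 levels leave one coefficient.
profile = [float(v) for v in range(16)]
features = haar_approximation(profile, 4)
```

Each level halves the number of coefficients, so a 4-level decomposition reduces the feature vector to roughly one sixteenth of the original length before classification.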

2.
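A tumor recognition model for colon cancer gene expression data is built on wavelet denoising and support vector machines. The experimental data are wavelet-decomposed, and the wavelet function and the number of decomposition levels are chosen according to the average classification accuracy of the test samples, computed by cross-validation. An energy threshold is applied to the wavelet decomposition coefficients to denoise the data. A method combining gene classification contribution rate with principal component analysis is proposed to extract features from the colon cancer samples, and the strong nonlinear mapping ability of the support vector machine is used to classify the samples nonlinearly. To reduce the influence of the sample-set partition on classification accuracy, the support vector classifier is tested with the Jackknife method, giving a classification accuracy of 96.77%. The results demonstrate the effectiveness of the method, which offers some reference value for colon cancer recognition.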

3.
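Feature extraction has long been a problem that many researchers focus on when analyzing high-dimensional protein spectrum data. This paper proposes a feature extraction method based on wavelet analysis with high-frequency coefficients and principal component analysis (PCA): wavelet analysis is first used to denoise the data and the high-frequency coefficients are extracted as features, after which PCA is used for dimensionality reduction. Experiments show that the proposed method achieves recognition rates of 100% and 99.45% on the 8-7-02 and 4/3/02 datasets respectively, and can effectively improve the classification recognition rate.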

4.
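Analysis of high-dimensional protein spectrum cancer data has always been hampered by high dimensionality. To address the problems that arise when reducing the dimensionality of such data, a feature extraction method based on wavelet analysis and principal component analysis is proposed, with a support vector machine used for classification after feature extraction. For a 2-level wavelet decomposition of the 8-7-02 dataset with the db1, db3, db4, db6, db8, db10 and haar wavelet bases, followed by support vector machine classification, the accuracies are 98.18%, 98.35%, 98.04%, 98.36%, 97.89%, 97.96% and 98.20% respectively. The method further improves classification accuracy while also improving time efficiency.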

5.
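王小兵, 孙久运. 《生物磁学》, 2011, (20): 3954-3957
Objective: Medical images are contaminated by noise to varying degrees during acquisition, storage and transmission, which greatly affects their use in clinical diagnosis and treatment. To remove noise from medical images effectively, a hybrid filtering algorithm is proposed. Methods: The algorithm first applies a morphological opening to an image containing Gaussian and salt-and-pepper noise, then performs a two-dimensional wavelet decomposition of the opened image to obtain high-frequency and low-frequency wavelet coefficients. The low-frequency coefficients are kept unchanged, the high-frequency coefficients are filtered with a Wiener filter, and the image is finally reconstructed from the wavelet coefficients. Results: The hybrid filtering algorithm, wavelet threshold denoising, median filtering and Wiener filtering were each applied to medical images containing mixed noise; the PSNR of the images denoised by the hybrid algorithm was clearly higher than that of the other three methods. Conclusion: The hybrid filtering algorithm is a fairly effective method for removing noise from medical images.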

6.
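A hybrid filtering algorithm for removing noise from medical images   Total citations: 1, self-citations: 0, citations by others: 1
Objective: Medical images are contaminated by noise to varying degrees during acquisition, storage and transmission, which greatly affects their use in clinical diagnosis and treatment. To remove noise from medical images effectively, a hybrid filtering algorithm is proposed. Methods: The algorithm first applies a morphological opening to an image containing Gaussian and salt-and-pepper noise, then performs a two-dimensional wavelet decomposition of the opened image to obtain high-frequency and low-frequency wavelet coefficients. The low-frequency coefficients are kept unchanged, the high-frequency coefficients are filtered with a Wiener filter, and the image is finally reconstructed from the wavelet coefficients. Results: The hybrid filtering algorithm, wavelet threshold denoising, median filtering and Wiener filtering were each applied to medical images containing mixed noise; the PSNR of the images denoised by the hybrid algorithm was clearly higher than that of the other three methods. Conclusion: The hybrid filtering algorithm is a fairly effective method for removing noise from medical images.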
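The comparison in this abstract rests on PSNR (peak signal-to-noise ratio). The sketch below shows the standard definition, PSNR = 10·log10(peak² / MSE), for two equally sized grayscale images stored as nested lists. The function name and the 8-bit peak value of 255 are assumptions for illustration; the abstract does not state the bit depth the authors used.

```python
import math

def psnr(reference, processed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images."""
    total, count = 0.0, 0
    for ref_row, out_row in zip(reference, processed):
        for r, o in zip(ref_row, out_row):
            total += (r - o) ** 2
            count += 1
    mse = total / count  # mean squared error over all pixels
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

# Hypothetical 1x2 image and a denoised version of it.
clean = [[100, 100]]
denoised = [[110, 90]]
score = psnr(clean, denoised)
```

A higher PSNR means the denoised image is closer to the reference, which is the sense in which the abstract ranks the four filters.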

7.
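Hyperspectral remote sensing retrieval of the net photosynthetic rate of moso bamboo leaves based on wavelet transform   Total citations: 3, self-citations: 0, citations by others: 3
Starting from a wavelet transform of hyperspectral reflectance data from moso bamboo forest leaves, the best wavelet vegetation indices were sought and determined for retrieving the leaf net photosynthetic rate (Pn). The results show that Pn retrieved with the ideal high-frequency wavelet vegetation indices is more accurate than with low-frequency wavelet vegetation indices or spectral vegetation indices. Among them, the normalized difference, ratio and difference vegetation indices built from the first-level high-frequency coefficients of the wavelet decomposition correlate best with Pn, with an R² of 0.7 and a relatively low root mean square error (RMSE) of 0.33, whereas the low-frequency wavelet vegetation indices retrieve Pn less accurately than the spectral vegetation indices. Pn retrieved by a multiple linear model built from the ideal wavelet vegetation indices of each level correlates significantly with measured Pn, with an R² of 0.77 and an RMSE of 0.29, and is clearly more accurate than the multiple linear model built from spectral vegetation indices. Whereas the bands sensitive to Pn for spectral vegetation indices are confined to the visible range, the wavelet vegetation indices detect a wider range of sensitive wavelengths, covering the visible and several infrared bands. After wavelet transformation, hyperspectral data reveal more detail reflecting moso bamboo Pn, and overall retrieval accuracy is significantly higher than with the original spectra. The results provide a new option for retrieving vegetation Pn from hyperspectral remote sensing.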
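The three index types named in this abstract have standard two-band forms: normalized difference (a − b)/(a + b), ratio a/b, and difference a − b. The sketch below applies those forms to two wavelet coefficients picked by position. The function name, the choice of positions, and the toy coefficient values are assumptions; the abstract does not say which coefficients the authors paired.

```python
def wavelet_indices(coeffs, i, j):
    """Build normalized-difference, ratio and difference indices
    from the wavelet coefficients at positions i and j."""
    a, b = coeffs[i], coeffs[j]
    return {
        "normalized_difference": (a - b) / (a + b),
        "ratio": a / b,
        "difference": a - b,
    }

# Hypothetical first-level detail coefficients at two wavelength positions.
detail = [0.75, 0.25]
indices = wavelet_indices(detail, 0, 1)
```

The sketch does not guard against a zero denominator; detail coefficients can be zero or negative, so a real pipeline would need to handle those cases.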

8.
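Based on empirical mode decomposition (EMD) theory, a method is proposed for analyzing EEG signals of left- and right-hand motor imagery. The EEG data are first segmented with a time window; each segment is decomposed by EMD into a set of intrinsic mode functions (IMFs), and the IMF layers containing the main signal are extracted to remove noise. The Hilbert transform is applied to those IMF layers to obtain the instantaneous frequency and the corresponding instantaneous amplitude. The energy of the mu and beta rhythms, the frequency bands specific to left- and right-hand imagery, is then extracted as features, and classification with a support vector machine (SVM) and with a Fisher classifier is compared. EMD and wavelet packets are also compared for denoising and feature extraction. The results show that EMD is a very effective denoising method, and that the energy features extracted after EMD are better at distinguishing left- from right-hand imagery, with a high recognition rate.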

9.
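Hyperspectral estimation of soil organic matter content based on wavelet transform   Total citations: 2, self-citations: 0, citations by others: 2
Chen HY, Zhao GX, Li XC, Zhu XC, Sui L, Wang YJ. 《应用生态学报》, 2011, 22(11): 2935-2942
Sixty samples with similar soil N, P and K contents but widely differing organic matter contents were selected by statistical analysis. Hyperspectral measurement gave the first-derivative spectra of the logarithm of sample reflectance, which were decomposed by multi-level discrete wavelet decomposition with the Bior 1.3 function; the low-frequency approximation signal and the high-frequency noise signal were removed to obtain characteristic spectral curves reflecting soil physicochemical parameters. Correlation analysis was used to screen the bands significantly correlated with soil organic matter content, and hyperspectral multiple regression estimation models were built from the significantly correlated bands and from the characteristic spectral curves respectively. By comparison, the best wavelet decomposition scale for extracting the characteristic spectrum of soil organic matter was determined and the best prediction model was built. The results show that the best number of decomposition levels is 9, followed by 8 and 10. The estimation model based on the characteristic spectral curve from the 9-level decomposition is the best, with a coefficient of determination (R²) of 0.89, which is 0.31 higher than that of the model built on significantly correlated bands and 0.10 higher than that of the model built on the original spectra.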

10.
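Gene chip data recognition based on the AR model   Total citations: 5, self-citations: 5, citations by others: 0
The autoregressive (AR) model is introduced into gene chip data recognition, and a time-series feature extraction method based on the AR model is proposed. With dynamic time warping (DTW) as the classifier, recognition results on standard tumor gene chip data show that the method achieves a 100% recognition rate and can be applied to gene chip data recognition, classification and inference of genetic diseases.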
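To make the DTW classifier in this abstract concrete, here is a minimal sketch of the classic dynamic-programming DTW distance and a nearest-template rule built on it. The AR feature extraction is omitted, and the nearest-template rule, function names and toy sequences are assumptions for illustration; the abstract does not specify how the authors turned DTW distances into class labels.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # stretch b
                                    cost[i][j - 1],      # stretch a
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def classify(sample, templates):
    """Return the label of the template with the smallest DTW distance.
    `templates` is a list of (label, sequence) pairs (hypothetical format)."""
    return min(templates, key=lambda t: dtw_distance(sample, t[1]))[0]

# Hypothetical feature sequences for two classes.
templates = [("tumor", [1, 2, 3]), ("normal", [3, 2, 1])]
label = classify([1, 2, 2, 3], templates)
```

Because DTW allows one element to align with several, the sample `[1, 2, 2, 3]` matches the `[1, 2, 3]` template at zero cost even though the lengths differ.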

11.
We present a method of data reduction using a wavelet transform in discriminant analysis when the number of variables is much greater than the number of observations. The method is illustrated with a prostate cancer study, where the sample size is 248, and the number of variables is 48,538 (generated using the ProteinChip technology). Using a discrete wavelet transform, the 48,538 data points are represented by 1271 wavelet coefficients. Information criteria identified 11 of the 1271 wavelet coefficients with the highest discriminatory power. The linear classifier with the 11 wavelet coefficients detected prostate cancer in a separate test set with a sensitivity of 97% and specificity of 100%.

12.
Cancer diagnosis based on microarray technology has drawn increasing attention in the past few years. Accurate and fast diagnostic results have led a wide range of researchers to use gene expression profiles produced from microarrays. Much research highlights the importance of gene selection and reports good results. However, the minimum gene sets derived from different methods seldom overlap and are often inconsistent even for the same dataset, partly because of the complexity of cancer. In this paper, cancer classification was attempted in an alternative way, using the whole gene expression profile of all samples instead of partial gene sets. Three common datasets, for acute leukemia, prostate cancer and lung cancer respectively, were tested with the NIPALS-KPLS method. Compared with other conventional methods, the results showed a wide improvement in classification accuracy. This paper indicates that the sample profile of gene expression may be explored as a better indicator for cancer classification, which deserves further investigation.

13.
MOTIVATION: To evaluate microarray data, clustering is widely used to group biological samples or genes. However, problems arise when comparing heterologous databases. As the clustering algorithm searches for similarities between experiments, it will most likely first separate the data sets, masking relationships that exist between samples from different databases. RESULTS: We developed a program, Venn Mapper, to calculate the statistical significance of the number of co-occurring differentially expressed genes in any of the two experiments. For proof of principle, we analysed a heterologous data set of 170 microarrays including breast and prostate cancer microarray analyses. Significant overlap was found in an unsupervised analysis between metastasized prostate cancer and metastasized breast cancer and BRCA mutated breast cancer. A comparison between single microarray data and the averaged breast and prostate data sets was also evaluated. This analysis suggests that genes expressed higher in stromal cells are also implicated in metastatic prostate cancer and BRCA mutated breast cancer. The Venn Mapper program identifies overlaps between samples from heterologous data sets and directly extracts the genes responsible for the overlap. From this information novel biological hypotheses may be addressed. AVAILABILITY: Venn Mapper is freely available on http://www.erasmusmc.nl/gatcplatform. SUPPLEMENTARY INFORMATION: http://www.erasmusmc.nl/gatcplatform/vennmapper.html.

14.
With recent advances in mass spectrometry techniques, it is now possible to investigate proteins over a wide range of molecular weights in small biological specimens. This advance has generated data-analytic challenges in proteomics, similar to those created by microarray technologies in genetics, namely, discovery of 'signature' protein profiles specific to each pathologic state (e.g. normal vs. cancer) or differential profiles between experimental conditions (e.g. treated by a drug of interest vs. untreated) from high-dimensional data. We propose a data-analytic strategy for discovering protein biomarkers based on such high-dimensional mass spectrometry data. A real biomarker-discovery project on prostate cancer is taken as a concrete example throughout the paper: the project aims to identify proteins in serum that distinguish cancer, benign hyperplasia, and normal states of prostate using the Surface Enhanced Laser Desorption/Ionization (SELDI) technology, a recently developed mass spectrometry technique. Our data-analytic strategy takes properties of the SELDI mass spectrometer into account: the SELDI output of a specimen contains about 48,000 (x, y) points where x is the protein mass divided by the number of charges introduced by ionization and y is the protein intensity of the corresponding mass per charge value, x, in that specimen. Given high coefficients of variation and other characteristics of protein intensity measures (y values), we reduce the measures of protein intensities to a set of binary variables that indicate peaks in the y-axis direction in the nearest neighborhoods of each mass per charge point in the x-axis direction. We then account for a shifting (measurement error) problem of the x-axis in SELDI output. After this pre-analysis processing of data, we combine the binary predictors to generate classification rules for cancer, benign hyperplasia, and normal states of prostate. 
Our approach is to apply the boosting algorithm to select binary predictors and construct a summary classifier. We empirically evaluate sensitivity and specificity of the resulting summary classifiers with a test dataset that is independent from the training dataset used to construct the summary classifiers. The proposed method performed nearly perfectly in distinguishing cancer and benign hyperplasia from normal. In the classification of cancer vs. benign hyperplasia, however, an appreciable proportion of the benign specimens were classified incorrectly as cancer. We discuss practical issues associated with our proposed approach to the analysis of SELDI output and its application in cancer biomarker discovery.
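The reduction of intensities to binary peak indicators described in this abstract can be sketched as a local-maximum test over a neighborhood of mass-per-charge points. The function name, the symmetric window, and the tie rule (every point equal to the window maximum is flagged) are assumptions for illustration; the abstract does not give the authors' neighborhood size or tie handling, and the x-axis shift correction and boosting steps are omitted.

```python
def peak_indicators(intensity, half_window):
    """Flag each point with 1 if its intensity is the maximum within
    `half_window` neighbours on either side, else 0."""
    n = len(intensity)
    flags = []
    for i in range(n):
        lo = max(0, i - half_window)          # clip window at the left edge
        hi = min(n, i + half_window + 1)      # clip window at the right edge
        flags.append(1 if intensity[i] == max(intensity[lo:hi]) else 0)
    return flags

# Hypothetical intensity trace (y values) along the mass-per-charge axis.
trace = [1, 3, 2, 5, 4]
narrow = peak_indicators(trace, 1)
wide = peak_indicators(trace, 2)
```

A wider window keeps only the more prominent peaks, which is one way such binary predictors become less sensitive to the high coefficients of variation the abstract mentions.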

15.
MOTIVATION: DNA microarray data analysis has been used previously to identify marker genes which discriminate cancer from normal samples. However, due to the limited sample size of each study, there are few common markers among different studies of the same cancer. With the rapid accumulation of microarray data, it is of great interest to integrate inter-study microarray data to increase sample size, which could lead to the discovery of more reliable markers. RESULTS: We present a novel, simple method of integrating different microarray datasets to identify marker genes and apply the method to prostate cancer datasets. In this study, by applying a new statistical method, referred to as the top-scoring pair (TSP) classifier, we have identified a pair of robust marker genes (HPN and STAT6) by integrating microarray datasets from three different prostate cancer studies. Cross-platform validation shows that the TSP classifier built from the marker gene pair, which simply compares relative expression values, achieves high accuracy, sensitivity and specificity on independent datasets generated using various array platforms. Our findings suggest a new model for the discovery of marker genes from accumulated microarray data and demonstrate how the great wealth of microarray data can be exploited to increase the power of statistical analysis. CONTACT: leixu@jhu.edu.
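The idea of a classifier that "simply compares relative expression values" can be sketched as follows: for every gene pair, estimate how often gene i is expressed below gene j in each class, and keep the pair where those frequencies differ most. The function names, the 0/1 label encoding, the first-found tie rule and the toy data are assumptions for illustration, and this is a simplified reading of the top-scoring pair idea, not the authors' code.

```python
from itertools import combinations

def top_scoring_pair(samples, labels):
    """Return (score, i, j, less_means_one) for the gene pair whose
    ordering differs most between class 0 and class 1."""
    class0 = [s for s, y in zip(samples, labels) if y == 0]
    class1 = [s for s, y in zip(samples, labels) if y == 1]
    best = None
    for i, j in combinations(range(len(samples[0])), 2):
        p0 = sum(s[i] < s[j] for s in class0) / len(class0)
        p1 = sum(s[i] < s[j] for s in class1) / len(class1)
        score = abs(p0 - p1)
        if best is None or score > best[0]:  # ties keep the first pair found
            best = (score, i, j, p1 > p0)
    return best

def predict(sample, pair):
    """Classify one sample from the ordering of the selected gene pair."""
    _, i, j, less_means_one = pair
    return int((sample[i] < sample[j]) == less_means_one)

# Hypothetical expression values for three genes in four samples.
samples = [[1, 2, 9], [2, 3, 1], [5, 1, 9], [6, 2, 1]]
labels = [0, 0, 1, 1]
pair = top_scoring_pair(samples, labels)
```

Because the rule depends only on which of two genes is higher within a sample, it is unaffected by per-sample scaling, which is consistent with the cross-platform validation the abstract reports.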

16.
The use of penalized logistic regression for cancer classification using microarray expression data is presented. Two dimension reduction methods are respectively combined with the penalized logistic regression so that both the classification accuracy and computational speed are enhanced. Two other machine-learning methods, support vector machines and least-squares regression, have been chosen for comparison. It is shown that our methods have achieved at least equal or better results. They also have the advantage that the output probability can be explicitly given and the regression coefficients are easier to interpret. Several other aspects, such as the selection of penalty parameters and components, pertinent to the application of our methods for cancer classification are also discussed.

17.
The use of microarray data has become quite commonplace in medical and scientific experiments. We focus here on microarray data generated from cancer studies. It is potentially important for the discovery of biomarkers to identify genes whose expression levels correlate with tumor progression. In this article, we propose a simple procedure for the identification of such genes, which we term tumor progression genes. The first stage involves estimation based on the proportional odds model. At the second stage, we calculate two quantities, a q-value and a shrinkage estimator of the test statistic, to adjust for the multiple testing problem. The relationship between the proposed method and the false discovery rate is studied. The proposed methods are applied to data from a prostate cancer microarray study.
