
1.

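Wu Wenfeng (吴文峰), Liu Yihui (刘毅慧) 《生物信息学》 (Chinese Journal of Bioinformatics) 2015,13(3):198-204

Feature extraction has long been a central problem in the analysis of high-dimensional protein spectral data. This paper proposes a feature extraction method that combines wavelet analysis based on high-frequency coefficients with principal component analysis (PCA): the data are first denoised by wavelet analysis, the high-frequency coefficients are extracted as features, and PCA is then used for dimensionality reduction. Experiments show recognition rates of 100% and 99.45% on the 8-7-02 and 4/3/02 datasets, respectively, indicating that the method can effectively improve classification accuracy.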
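The two stages described above (wavelet high-frequency coefficients, then PCA) can be sketched as follows. This is only a minimal illustration under stated assumptions: the abstract does not name the wavelet or the decomposition level, so a single-level Haar transform is assumed, and the "spectra" are synthetic random data rather than the 8-7-02 or 4/3/02 datasets. The function names `haar_dwt` and `pca_reduce` are hypothetical.

```python
import numpy as np

def haar_dwt(signal):
    """One level of the Haar wavelet transform: returns (approximation, detail)."""
    x = np.asarray(signal, dtype=float)
    if x.size % 2:                      # pad odd-length signals by repeating the last sample
        x = np.append(x, x[-1])
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2)  # low-frequency coefficients
    detail = (even - odd) / np.sqrt(2)  # high-frequency coefficients
    return approx, detail

def pca_reduce(features, n_components):
    """Project the rows of `features` onto the top principal components (via SVD)."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

rng = np.random.default_rng(0)
spectra = rng.normal(size=(10, 64))                    # synthetic stand-in, one row per sample
details = np.array([haar_dwt(s)[1] for s in spectra])  # high-frequency features
reduced = pca_reduce(details, n_components=3)          # low-dimensional features for a classifier
print(details.shape, reduced.shape)
```

The reduced matrix would then be passed to whatever classifier is used; the abstract does not say which one.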

2.

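Wavelet denoising of gene expression data and tumor recognition

Yang Zhenhua (杨振华), Meng Jun (孟军) 《生物数学学报》 (Journal of Biomathematics) 2012,(3):555-562

A tumor recognition model for colon cancer gene expression data was built on wavelet denoising and a support vector machine (SVM). The experimental data were decomposed by wavelet transform, and the wavelet function and the number of decomposition levels were chosen from the average classification accuracy computed by cross-validation. An energy threshold method was introduced to threshold the wavelet decomposition coefficients and thereby denoise the data. A method combining gene classification contribution rate with principal component analysis was proposed to extract features from the colon cancer samples, and the strong nonlinear mapping ability of the SVM was used to classify the samples nonlinearly. To reduce the influence of the sample-set partition on classification accuracy, the support vector classifier was evaluated with the Jackknife test, giving a classification accuracy of 96.77%. The results demonstrate the effectiveness of the method, which has some reference value for colon cancer recognition.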
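Two steps named above, energy-based thresholding of coefficients and the Jackknife (leave-one-out) test, can be sketched as below. The abstract does not define its energy threshold, so the rule here (keep the largest coefficients that together hold a chosen fraction of the energy) is an assumption, and a nearest-centroid classifier is swapped in for the SVM to keep the example dependency-free. All function names are hypothetical.

```python
import numpy as np

def energy_threshold(coeffs, keep_energy=0.95):
    """Zero the smallest coefficients, keeping the largest ones that together
    carry at least `keep_energy` of the total energy (sum of squares)."""
    c = np.asarray(coeffs, dtype=float)
    order = np.argsort(np.abs(c))[::-1]            # largest magnitude first
    cumulative = np.cumsum(c[order] ** 2)
    total = cumulative[-1]
    if total == 0:
        return c.copy()
    n_keep = int(np.searchsorted(cumulative, keep_energy * total)) + 1
    out = np.zeros_like(c)
    out[order[:n_keep]] = c[order[:n_keep]]
    return out

def jackknife_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-centroid classifier
    (a simple stand-in for the SVM mentioned in the abstract)."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    hits = 0
    for i in range(len(y)):
        mask = np.arange(len(y)) != i              # hold out sample i
        classes = np.unique(y[mask])
        centroids = np.array([X[mask][y[mask] == c].mean(axis=0) for c in classes])
        pred = classes[np.argmin(np.linalg.norm(centroids - X[i], axis=1))]
        hits += int(pred == y[i])
    return hits / len(y)

print(energy_threshold([10.0, 1.0, 0.5, 0.1]))
X_demo = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
y_demo = [0, 0, 0, 1, 1, 1]
print(jackknife_accuracy(X_demo, y_demo))
```

Leave-one-out evaluation removes the dependence on any single train/test split, which is the motivation the abstract gives for the Jackknife test.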

3.

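Application of nuclear magnetic resonance spectroscopy to the determination of protein three-dimensional structures

Yin Lin (尹林), Shen Juncheng (申峻丞), Yang Liqun (杨立群) 《生物化学与生物物理进展》 (Progress in Biochemistry and Biophysics) 2022,49(7):1273-1290

The specific three-dimensional structure of a protein is closely related to its biological function, so studying that structure helps to reveal functional mechanisms. Applying nuclear magnetic resonance (NMR) spectroscopy to proteins in solution can reveal the relationship between protein structure and biological function more accurately. This review covers the theory and techniques of NMR-based determination of protein three-dimensional structures, together with recent progress and new methods that combine NMR with other biophysical techniques, supplemented by molecular modeling calculations, and offers ideas and strategies for the precise determination of protein three-dimensional structures.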

4.

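Protein secondary structure prediction based on an RBF neural network

Zhang Bin (张斌), Yin Jingyuan (尹京苑), Xue Dan (薛丹) 《生物信息学》 (Chinese Journal of Bioinformatics) 2011,9(3):224-228,234

The secondary structure of a protein is important for studying its function. Principal component analysis was used to reduce the dimensionality and noise of the basic physicochemical properties of the amino acids and their secondary structure propensities, and a radial basis function (RBF) neural network was used to predict protein secondary structure. PCA turned the original 20 × 12 matrix into a 20 × 4 matrix, greatly reducing the dimension of the network input. In simulations, the prediction accuracy reached 71.81% with a window size of 21 and a spread of 7. The results show that the RBF neural network can be used effectively for protein secondary structure prediction.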
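To make the input-size arithmetic concrete: with a 20 × 4 encoding and a window of 21 residues, each residue is presumably described by 21 × 4 = 84 inputs. The sketch below builds such windowed inputs. The encoding matrix is a random placeholder (the real PCA-reduced values are not given in the abstract), zero padding at the chain ends is an assumption, and `window_features` is a hypothetical name.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def window_features(sequence, encoding, window=21):
    """Return one row per residue: the encodings of the residues in a
    window centered on it, flattened. Ends are padded with zeros."""
    half = window // 2
    n_feat = encoding.shape[1]
    padded = np.zeros((len(sequence) + 2 * half, n_feat))
    for pos, aa in enumerate(sequence):
        padded[pos + half] = encoding[AMINO_ACIDS.index(aa)]
    return np.array([padded[i:i + window].ravel() for i in range(len(sequence))])

rng = np.random.default_rng(0)
encoding = rng.normal(size=(20, 4))   # placeholder for the 20 x 4 PCA-reduced matrix
X = window_features("MKTAYIAKQR", encoding)
print(X.shape)                        # one 84-dimensional input vector per residue
```

With the unreduced 20 × 12 matrix the same window would need 21 × 12 = 252 inputs, which is the reduction the abstract refers to.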

5.

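A fast alignment-free method for protein sequence similarity and evolutionary analysis

Ai Liang (艾亮), Feng Jie (冯杰) 《生物信息学》 (Chinese Journal of Bioinformatics) 2023,21(3):179-186

This paper proposes a new fast alignment-free method for protein sequence similarity and evolutionary analysis. To characterize a protein sequence, 10 physicochemical properties of the amino acids are first condensed into 6 principal components by principal component analysis, and the principal component scores are averaged with the number of occurrences of each amino acid in the sequence as weights. Amino acid position information is then fused in to form a 26-dimensional feature vector for the protein sequence, and the Euclidean distance is finally used to measure similarity and evolutionary relationships between sequences. Tests on 3 protein sequence datasets show that the method clusters every protein sequence correctly and is simple and fast, which demonstrates its effectiveness.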
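A rough sketch of how such a feature vector might be assembled is given below. Several parts are assumptions, not the authors' definitions: the property table is random placeholder data, and the "position information" is taken to be the mean normalized position of each of the 20 amino acids, which is one way to arrive at 6 + 20 = 26 dimensions. The names `pca_scores` and `sequence_vector` are hypothetical.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def pca_scores(table, n_components):
    """Principal component scores of each row (amino acid) of `table`."""
    centered = table - table.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def sequence_vector(seq, scores):
    """Count-weighted mean of PC scores (6 values) followed by the mean
    normalized position of each amino acid (20 values; assumed definition)."""
    idx = np.array([AMINO_ACIDS.index(a) for a in seq])
    counts = np.bincount(idx, minlength=20)
    weighted = counts @ scores / counts.sum()
    positions = np.arange(1, len(seq) + 1) / len(seq)
    mean_pos = np.array([positions[idx == k].mean() if counts[k] else 0.0
                         for k in range(20)])
    return np.concatenate([weighted, mean_pos])

rng = np.random.default_rng(0)
properties = rng.normal(size=(20, 10))     # placeholder for 10 physicochemical properties
scores = pca_scores(properties, 6)         # 20 amino acids x 6 principal components
v1 = sequence_vector("MKTAYIAKQRQISFVK", scores)
v2 = sequence_vector("MKTAYLAKQRQLSFVK", scores)
print(v1.shape, np.linalg.norm(v1 - v2))   # Euclidean distance between two sequences
```

Because every sequence maps to a fixed-length vector, pairwise distances can be computed without alignment, which is where the speed of such methods comes from.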

6.

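Analysis of the dimensionality reduction performance of PCA and ICA on MRI images

Yang Guocheng (杨国城), Huang Zhiwei (黄志伟), Cao Gaofei (曹高飞) 《生物技术世界》 (Biotech World) 2012,(4):8+10

This paper applies PCA-SVM, ICA-SVM, and a single SVM to the same MRI experimental data and analyzes the dimensionality reduction performance of the three approaches on MRI images. The study finds that PCA-SVM is suitable for large experimental arrays: its discriminative performance is close to that of a single SVM model on the original array, and it greatly reduces computation time.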

7.

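Feature extraction from cancer gene expression profile data based on LNMF

Wang Ruiping (王蕊平), Wang Nian (王年), Su Liangliang (苏亮亮), Chen Le (陈乐) 《生物信息学》 (Chinese Journal of Bioinformatics) 2011,9(2):164-166,170

The existence of massive data is a hallmark of the modern information society, and effectively selecting classification features from among thousands of genes is of great significance for cancer diagnosis and treatment. Local non-negative matrix factorization (LNMF) is used to extract features from cancer gene expression profile data. The gene expression data are first screened; a local non-negative matrix is then constructed and factorized to obtain low-dimensional feature vectors that adequately characterize the samples; finally, a support vector machine classifies the feature vectors. The results indicate that the method is feasible and effective.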
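For orientation, the sketch below shows plain non-negative matrix factorization with the standard multiplicative updates for the Frobenius error. It is not LNMF: the local variant adds locality and sparsity terms that the abstract does not spell out, so they are omitted here. The data are synthetic and the function name `nmf` is hypothetical.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (features x samples) as W @ H using
    multiplicative updates; columns of H are low-dimensional sample features."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update coefficients
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update basis vectors
    return W, H

rng = np.random.default_rng(1)
V = rng.random((30, 3)) @ rng.random((3, 20))   # synthetic non-negative data of rank 3
W, H = nmf(V, rank=3)
print(np.linalg.norm(V - W @ H) / np.linalg.norm(V))   # relative reconstruction error
```

In a pipeline like the one described, the columns of `H` would be the feature vectors handed to the classifier.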

8.

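Classification of protein homo-oligomers based on amino acid composition distribution (total citations: 7; self-citations: 0; citations by others: 7)

Shi Jianyu (施建宇), Pan Quan (潘泉), Zhang Shaowu (张绍武), Cheng Yongmei (程咏梅) 《生物物理学报》 (Acta Biophysica Sinica) 2006,22(1):49-56

Using a new feature extraction method, amino acid composition distribution, with support vector machines as member classifiers and a one-versus-one multi-class strategy, four classes of homo-oligomers were classified from protein primary sequence. Under 10-fold cross-validation (10-CV), the total classification accuracy and accuracy index based on amino acid composition distribution reached 86.22% and 67.12%, respectively. These values are 5.74 and 10.03 percentage points higher than those of the traditional feature extraction method based on amino acid composition, and 3.12 and 5.63 percentage points higher than those of the dipeptide composition method, showing that amino acid composition distribution is a very effective feature extraction method for classifying protein homo-oligomers. Combining amino acid composition distribution with protein sequence length gave a total classification accuracy and accuracy index of 86.35% and 67.23%, respectively, suggesting that sequence length carries some spatial structure information.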
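The two baseline feature sets mentioned above, amino acid composition (20 values) and dipeptide composition (400 values), are easy to state in code, as is the number of pairwise classifiers a one-versus-one scheme needs for four classes. The new "composition distribution" feature itself is not defined in the abstract, so it is not attempted here. Function names and class labels are hypothetical, and the sketch assumes sequences contain only the 20 standard residues.

```python
from itertools import combinations, product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]

def amino_acid_composition(seq):
    """Fraction of each of the 20 amino acids in the sequence."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]

def dipeptide_composition(seq):
    """Fraction of each of the 400 overlapping residue pairs."""
    total = len(seq) - 1
    counts = {p: 0 for p in DIPEPTIDES}
    for i in range(total):
        counts[seq[i:i + 2]] += 1
    return [counts[p] / total for p in DIPEPTIDES]

aac = amino_acid_composition("MKKA")
dpc = dipeptide_composition("MKKA")
print(len(aac), len(dpc))

# One-versus-one: one binary classifier per unordered pair of classes.
classes = ["class_1", "class_2", "class_3", "class_4"]   # placeholder labels
pairs = list(combinations(classes, 2))
print(len(pairs))   # 4 * 3 / 2 pairwise classifiers
```

At prediction time each pairwise classifier votes, and the class with the most votes is usually taken as the answer.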

9.

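Feature extraction from gene chip data based on wavelet low-frequency coefficients

Liu Yujie (刘玉杰), Liu Yihui (刘毅慧) 《生物信息学》 (Chinese Journal of Bioinformatics) 2011,9(3):255-258,262

Feature extraction and classification are key problems in pattern recognition. A classifier model was built by combining wavelet analysis with support vector machine theory to divide prostate cancer gene chip data into cancer and normal classes. Wavelet low-frequency coefficients were extracted to represent the original data and fed to a support vector machine classifier. Experiments show that the low-frequency coefficients from a 4-level db1 wavelet decomposition gave a correct classification rate of 93.53%, while the Haar wavelet gave 92.94%. Low-frequency coefficients from different wavelets therefore yield similar classification results.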
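A 4-level low-frequency (approximation) extraction can be sketched in a few lines. The example assumes that db1 denotes the Haar filter, as in the usual Daubechies naming, and handles odd lengths by repeating the last sample, which is only one of several possible boundary conventions. The function name `haar_approximation` is hypothetical.

```python
import numpy as np

def haar_approximation(signal, levels=4):
    """Approximation coefficients after `levels` rounds of Haar low-pass
    filtering and downsampling by two."""
    x = np.asarray(signal, dtype=float)
    for _ in range(levels):
        if x.size % 2:                       # repeat the last sample on odd lengths
            x = np.append(x, x[-1])
        x = (x[0::2] + x[1::2]) / np.sqrt(2)
    return x

rng = np.random.default_rng(0)
profile = rng.normal(size=64)                # synthetic stand-in for one expression profile
features = haar_approximation(profile, levels=4)
print(profile.size, features.size)           # 64 values reduced to 4 coefficients
```

Each level halves the length, so four levels shrink the input by a factor of 16 before classification.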

10.

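Blind separation of multichannel EEG signals

You Rongyi (游荣义), Xu Shenchu (徐慎初), Chen Zhong (陈忠) 《生物物理学报》 (Acta Biophysica Sinica) 2004,20(1):77-82

A new method for blind separation of multichannel EEG signals is proposed that combines the wavelet transform with independent component analysis (ICA). The denoising effect of the wavelet transform is used to filter out part of the high-frequency noise mixed into the raw EEG, and the reconstructed EEG is then used as the input to ICA, which effectively overcomes the inability of existing ICA algorithms to distinguish noise. Experimental results show that the method is very effective for blind separation of multichannel EEG.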

11.

Analysis of protein glycosylation by mass spectrometry (total citations: 1; self-citations: 0; citations by others: 1)

Bo Nilsson 《Molecular biotechnology》1994,2(3):243-280

There is a growing pharmaceutical market for protein-based drugs for use in therapy and diagnosis. The rapid developments in molecular and cell biology have resulted in production of expression systems for manufacturing of recombinant proteins and monoclonal antibodies. These proteins are glycosylated when expressed in cell systems with glycosylation ability. For glycoproteins intended for therapeutic administration it is important to have knowledge about the structure of the carbohydrate side chains to avoid cell systems that produce structures, which in humans can cause undesired reactions, e.g., immunological and unfavorable serum clearance rate. Structural analysis of glycoprotein oligosaccharides requires sophisticated instruments like mass spectrometers and nuclear magnetic resonance spectrometers. However, before the structural analysis can be conducted, the carbohydrate chains have to be released from the protein and purified to homogeneity, and this is often the most time-consuming step. Mass spectrometry has played and still plays an important role in analysis of protein glycosylation. The superior sensitivity compared to other spectroscopic methods is its main asset. Structural analysis of carbohydrates faces several problems, however, due to the chemical nature of the constituent monosaccharide residues. For oligosaccharides or glycoconjugates, the structural information from mass spectrometry is essentially limited to monosaccharide sequence, molecular weight, and only in exceptional cases glycosidic linkage positions can be obtained. In order to completely establish an oligosaccharide structure, several other structural parameters have to be determined, e.g., linkage positions, anomeric configuration and identification of the monosaccharide building blocks. One way to address some of these problems is to work on chemical pretreatment of the glycoconjugate, to specifically modify the carbohydrate chain. In order to introduce specific modifications, we have used periodate oxidation and trifluoroacetolysis with the objective of determining glycosidic linkage positions by mass spectrometry.

12.

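Feature selection for high-resolution protein mass spectrometry data based on simulated annealing

Li Yifeng (李义峰), Liu Yihui (刘毅慧) 《生物信息学》 (Chinese Journal of Bioinformatics) 2009,7(2):85-90

Protein mass spectrometry is an important research tool in proteomics and has been applied with great success to areas such as early cancer diagnosis, but the curse of dimensionality in protein mass spectrometry data makes dimensionality reduction a necessary step in spectrum analysis. This paper first preprocesses the high-resolution SELDI-TOF ovarian mass spectrometry data provided by the U.S. National Cancer Institute. The feature selection problem is then cast as a combinatorial optimization model solved by simulated annealing: the objective function is built from the classification error rate and sample posterior probabilities of linear discriminant analysis, the new-solution generator is built from a uniform distribution and a control parameter, and a memory function is added to the annealing process. Training and test samples are then chosen by 10-fold cross-validation, and a linear discriminant analysis classifier is used to evaluate the reduced data. Experiments show that when simulated annealing selects 6 or more features, all of the high-resolution SELDI-TOF ovarian mass spectrometry data are classified correctly, indicating that simulated annealing is well suited to feature selection for protein mass spectrometry data.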
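The general shape of simulated-annealing feature selection with a memory of the best subset can be sketched as follows. This is a simplified illustration, not the paper's model: the objective is the training error of a nearest-centroid classifier rather than the LDA error and posterior probabilities, the neighbor move swaps one feature chosen uniformly at random, the cooling schedule is a plain geometric one, and the data are synthetic. It requires `k` to be smaller than the number of features. All names are hypothetical.

```python
import math
import numpy as np

def centroid_error(X, y, features):
    """Training error of a nearest-centroid classifier on the chosen columns."""
    Xs = X[:, features]
    classes = np.unique(y)
    centroids = np.array([Xs[y == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(Xs[:, None, :] - centroids[None, :, :], axis=2)
    return float(np.mean(classes[np.argmin(dists, axis=1)] != y))

def anneal_features(X, y, k, t0=1.0, cooling=0.95, steps=300, seed=0):
    """Search for k features with low error; returns (features, error)."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    current = list(rng.choice(n, size=k, replace=False))
    cur_err = centroid_error(X, y, current)
    best, best_err = list(current), cur_err           # memory of the best solution seen
    t = t0
    for _ in range(steps):
        candidate = list(current)
        outside = [j for j in range(n) if j not in current]
        candidate[rng.integers(k)] = outside[rng.integers(len(outside))]
        err = centroid_error(X, y, candidate)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if err <= cur_err or rng.random() < math.exp((cur_err - err) / t):
            current, cur_err = candidate, err
            if cur_err < best_err:
                best, best_err = list(current), cur_err
        t *= cooling                                  # geometric cooling
    return sorted(int(j) for j in best), best_err

rng = np.random.default_rng(1)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 10))
X[:, 3] += 4.0 * y                                    # only feature 3 separates the classes
selected, err = anneal_features(X, y, k=2)
print(selected, err)
```

The memory step matters because annealing may wander away from a good subset while the temperature is still high; keeping the best-so-far solution means that subset is not lost.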

13.

Multivariate denoising methods combining wavelets and principal component analysis for mass spectrometry data

Elise Mostacci Caroline Truntzer Hervé Cardot Patrick Ducoroy 《Proteomics》2010,10(14):2564-2572

The identification of new diagnostic or prognostic biomarkers is one of the main aims of clinical cancer research. In recent years, there has been a growing interest in using mass spectrometry for the detection of such biomarkers. The MS signal resulting from MALDI‐TOF measurements is contaminated by different sources of technical variations that can be removed by a prior pre‐processing step. In particular, denoising makes it possible to remove the random noise contained in the signal. Wavelet methodology associated with thresholding is usually used for this purpose. In this study, we adapted two multivariate denoising methods that combine wavelets and PCA to MS data. The objective was to obtain better denoising of the data so as to extract the meaningful proteomic biological information from the raw spectra and reach meaningful clinical conclusions. The proposed methods were evaluated and compared with the classical soft thresholding denoising method using both real and simulated data sets. It was shown that taking into account common structures of the signals by adding a dimension reduction step on approximation coefficients through PCA provided more effective denoising when combined with soft thresholding on detail coefficients.
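The combination described in the last sentence, PCA on approximation coefficients plus soft thresholding on detail coefficients, can be illustrated with a deliberately small sketch. It assumes a single-level Haar transform, even-length spectra, a fixed user-chosen threshold, and synthetic data; the paper's actual wavelet, decomposition depth, and threshold rule are not given in the abstract. All function names are hypothetical.

```python
import numpy as np

def soft_threshold(coeffs, threshold):
    """Shrink coefficients toward zero by `threshold`; small ones become zero."""
    c = np.asarray(coeffs, dtype=float)
    return np.sign(c) * np.maximum(np.abs(c) - threshold, 0.0)

def pca_lowrank(matrix, n_components):
    """Reconstruct the rows of `matrix` from their top principal components."""
    mean = matrix.mean(axis=0)
    u, s, vt = np.linalg.svd(matrix - mean, full_matrices=False)
    return mean + (u[:, :n_components] * s[:n_components]) @ vt[:n_components]

def haar_forward(x):
    """Single-level Haar transform along the last axis (even length assumed)."""
    return ((x[..., 0::2] + x[..., 1::2]) / np.sqrt(2),
            (x[..., 0::2] - x[..., 1::2]) / np.sqrt(2))

def haar_inverse(approx, detail):
    """Inverse of haar_forward."""
    out = np.empty(approx.shape[:-1] + (2 * approx.shape[-1],))
    out[..., 0::2] = (approx + detail) / np.sqrt(2)
    out[..., 1::2] = (approx - detail) / np.sqrt(2)
    return out

def denoise(spectra, n_components, threshold):
    """PCA across spectra on approximations, soft thresholding on details."""
    approx, detail = haar_forward(np.asarray(spectra, dtype=float))
    return haar_inverse(pca_lowrank(approx, n_components),
                        soft_threshold(detail, threshold))

rng = np.random.default_rng(0)
clean = np.tile(np.sin(np.linspace(0, 3 * np.pi, 32)), (6, 1))   # six identical spectra
noisy = clean + 0.1 * rng.normal(size=clean.shape)
denoised = denoise(noisy, n_components=1, threshold=0.2)
print(np.abs(noisy - clean).mean(), np.abs(denoised - clean).mean())
```

The PCA step is what makes the approach multivariate: structure shared across spectra is kept, while the thresholding acts on each spectrum's detail coefficients separately.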

14.

Citraconylation--a simple method for high protein sequence coverage in MALDI-TOF mass spectrometry

Kadlík V Strohalm M Kodícek M 《Biochemical and biophysical research communications》2003,305(4):1091-1093

The lysine ε-amino group reacts with citraconic anhydride to form a derivative that is stable under the conditions used for trypsin cleavage. This modification changes the spectrum of peptides formed by the trypsin action; as the number of trypsin-sensitive sites is reduced, the peptides with higher molecular mass can survive in the digest. The various studies of proteins by MALDI-TOF mass spectrometry are often complicated by the low sequence coverage of the peptide chain. This paper demonstrates that the modification of proteins by citraconylation before trypsin cleavage represents a simple experimental technique, which allows a significant increase of sequence coverage in MALDI-TOF mass spectrometry. This improvement is caused both by change of trypsin fragmentation pattern and by disturbance of the protein's native tertiary structure.
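The effect of blocking lysine on the peptide pattern can be illustrated with a simple in silico digest. The sketch uses the common simplified rule that trypsin cleaves after K or R unless the next residue is P, and assumes that a modified lysine is not cleaved at all; the peptide sequence is made up for the example, and `tryptic_peptides` is a hypothetical name.

```python
import re

def tryptic_peptides(sequence, lysine_blocked=False):
    """Split a protein sequence at tryptic sites (after K/R, not before P).
    With lysine_blocked=True only arginine sites are cleaved."""
    sites = "R" if lysine_blocked else "KR"
    # Zero-width split: position preceded by a cleavable residue, not followed by P.
    return [p for p in re.split(rf"(?<=[{sites}])(?!P)", sequence) if p]

seq = "MKTAYRAKPQRLLK"                       # made-up example sequence
print(tryptic_peptides(seq))                 # unmodified protein
print(tryptic_peptides(seq, lysine_blocked=True))
```

With fewer cleavage sites the digest contains fewer but longer peptides, which matches the abstract's point that higher-mass peptides survive.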

15.

Evaluation of algorithms for protein identification from sequence databases using mass spectrometry data

Chamrad DC Körting G Stühler K Meyer HE Klose J Blüggel M 《Proteomics》2004,4(3):619-628

In this work, the commonly used algorithms for mass spectrometry based protein identification, Mascot, MS-Fit, ProFound and SEQUEST, were studied with respect to the selectivity and sensitivity of their searches. The influence of various search parameters was also investigated. Approximately 6600 searches were performed using different search engines with several search parameters to establish a statistical basis. The applied mass spectrometric data set was chosen from a current proteome study. The huge amount of data could only be handled with computational assistance. We present a software solution for fully automated triggering of several peptide mass fingerprinting (PMF) and peptide fragmentation fingerprinting (PFF) algorithms. The development of this high-throughput method made an intensive evaluation based on data acquired in a typical proteome project possible. Previous evaluations of PMF and PFF algorithms were mainly based on simulations.

16.

Electrospray-ionization mass spectrometry as a tool for fast screening of protein structural properties

Rita Grandori Carlo Santambrogio Stefania Brocca Gaetano Invernizzi Marina Lotti 《Biotechnology journal》2009,4(1):73-87

Since the early 1990s, electrospray-ionization mass spectrometry (ESI-MS) has encountered growing interest as a complementary tool to established biochemical and biophysical methods for investigating protein structure and conformation. Nowadays, applications of ESI-MS to protein investigation span from the area of analytical biochemistry to that of structural biology. This review focuses on applications of this technique to the analysis of protein conformational properties and molecular interactions, underscoring their possible relevance for molecular biotechnology, although representing a still very young field. An introductory section presents the major issues related to theoretical and technical aspects of ESI-MS under non-denaturing conditions. Examples from our work and from the literature illustrate which kind of information can be obtained concerning key issues in biotechnology such as stability and aggregation of proteins under both near-native and challenging conditions, and interactions with other proteins, ligands and cofactors.

17.

Integrated mass spectrometry strategy for functional protein complex discovery and structural characterization

《Current opinion in chemical biology》2023

The discovery of functional protein complexes and the interrogation of the complex structure-function relationship (SFR) play crucial roles in the understanding and intervention of biological processes. Affinity purification-mass spectrometry (AP-MS) has proved to be a powerful tool in the discovery of protein complexes. However, validation of these novel protein complexes as well as elucidation of their molecular interaction mechanisms are still challenging. Recently, native top-down MS (nTDMS) has developed rapidly for the structural analysis of protein complexes. In this review, we discuss the integration of AP-MS and nTDMS in the discovery and structural characterization of functional protein complexes. Further, we think that emerging artificial intelligence (AI)-based protein structure prediction is highly complementary to nTDMS and that the two can promote each other. We expect the hybridization of integrated structural MS with AI prediction to be a powerful workflow in the discovery and SFR investigation of functional protein complexes.

18.

Additive risk models for survival data with high-dimensional covariates

Ma S Kosorok MR Fine JP 《Biometrics》2006,62(1):202-210

As a useful alternative to Cox's proportional hazard model, the additive risk model assumes that the hazard function is the sum of the baseline hazard function and the regression function of covariates. This article is concerned with estimation and prediction for the additive risk models with right censored survival data, especially when the dimension of the covariates is comparable to or larger than the sample size. Principal component regression is proposed to give unique and numerically stable estimators. Asymptotic properties of the proposed estimators, component selection based on the weighted bootstrap, and model evaluation techniques are discussed. This approach is illustrated with analysis of the primary biliary cirrhosis clinical data and the diffuse large B-cell lymphoma genomic data. It is shown that this methodology is numerically stable and effective in dimension reduction, while still being able to provide satisfactory prediction and classification results.
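In symbols, the model in the first sentence and the principal component step can be written as below. The notation is an assumption made for illustration: $Z$ is the covariate vector, $\lambda_0$ the baseline hazard, $V_k$ the matrix whose columns are the first $k$ principal directions of the centered covariates, and $\gamma$ the coefficient vector on the component scores. A linear regression function is assumed.

```latex
% Additive risk model: hazard = baseline hazard + linear function of covariates
\lambda(t \mid Z) = \lambda_0(t) + \beta^{\top} Z

% Principal component regression: regress on the first k component scores
\lambda(t \mid Z) = \lambda_0(t) + \gamma^{\top} \left( V_k^{\top} Z \right),
\qquad \hat{\beta} = V_k \hat{\gamma}
```

Restricting the fit to $k$ components, with $k$ smaller than the sample size, is what would make the estimator unique when the number of covariates exceeds the number of samples.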

19.

Imaging mass spectrometry: principle and application

Chihiro Murayama Yoshishige Kimura Mitsutoshi Setou 《Biophysical reviews》2009,1(3):131-139

Imaging mass spectrometry (IMS) is two-dimensional mass spectrometry to visualize the spatial distribution of biomolecules, which does not need either separation or purification of target molecules, and enables us to monitor not only the identification of unknown molecules but also the localization of numerous molecules simultaneously. Among the ionization techniques, matrix assisted laser desorption/ionization (MALDI) is one of the most generally used for IMS, which allows the analysis of numerous biomolecules ranging over wide molecular weights. Proper selection and preparation of matrix is essential for successful imaging using IMS. Tandem mass spectrometry, which is referred to as MSⁿ, enables the structural analysis of a molecule detected by the first step of IMS. Applications of IMS were initially developed for studying proteins or peptides. At present, however, targets of IMS research have expanded to the imaging of small endogenous metabolites such as lipids, exogenous drug pharmacokinetics, exploring new disease markers, and other new scientific fields. We hope that this new technology will open a new era for biophysics.