19 similar articles found (search time: 62 ms)
1.
2.
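A tumor recognition model for colon cancer gene expression data was built on wavelet denoising and support vector machines. The experimental data were wavelet-decomposed, and the average classification accuracy of the samples, computed by cross-validation, was used to choose the wavelet function and the number of decomposition levels. An energy threshold method was introduced to threshold the wavelet decomposition coefficients and thereby remove noise. A method combining gene classification contribution rate with principal component analysis was proposed to extract features from the colon cancer sample data. The strong nonlinear mapping ability of support vector machines was then used to classify the colon cancer samples nonlinearly. To reduce the influence of how the sample set is partitioned on classification accuracy, the support vector classifier was evaluated with the jackknife test, which gave a classification accuracy of 96.77%. The results demonstrate the effectiveness of the method, which may serve as a reference for colon cancer recognition.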
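The jackknife test mentioned in the abstract above can be sketched as leave-one-out evaluation: each sample is held out once, the classifier is trained on the rest, and the held-out sample is scored. The sketch below is only an illustration under stated assumptions. It swaps the paper's support vector machine for a simple nearest-centroid classifier so that it runs with the standard library alone, and the function names and toy data are hypothetical, not taken from the paper.

```python
import math


def nearest_centroid_predict(train_x, train_y, x):
    """Stand-in classifier (NOT the paper's SVM): pick the closest class centroid."""
    sums, counts = {}, {}
    for features, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, math.inf
    for label, acc in sums.items():
        centroid = [v / counts[label] for v in acc]
        dist = math.dist(centroid, x)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def jackknife_accuracy(samples, labels, predict=nearest_centroid_predict):
    """Hold out each sample once, train on the others, score the held-out sample."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


# Hypothetical toy data with two well-separated classes (not the colon cancer set).
toy_x = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
toy_y = ["normal", "normal", "normal", "tumor", "tumor", "tumor"]
accuracy = jackknife_accuracy(toy_x, toy_y)
```

Because every sample is tested exactly once, the estimate does not depend on a random split of the data, which is the property the abstract relies on.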
3.
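The specific three-dimensional structure of a protein is closely related to its biological function, so studying that structure helps reveal the mechanisms behind the function. Applying nuclear magnetic resonance (NMR) spectroscopy to the three-dimensional structure of proteins in solution can reveal the relationship between protein structure and biological function more accurately. This article reviews the theory and techniques of protein three-dimensional structure determination by NMR, together with research progress and recent methods that combine NMR with other biophysical techniques, supplemented by molecular modeling calculations, and offers ideas and strategies for the precise determination of protein three-dimensional structures.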
4.
5.
6.
7.
8.
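Classification of protein homo-oligomers based on amino acid composition distribution (total citations: 7; self-citations: 0; citations by others: 7)
Using a new feature extraction method, amino acid composition distribution, with support vector machines as member classifiers and a "one-versus-one" multi-class strategy, four classes of homo-oligomers were classified from protein primary sequence. Under the 10-CV test (10-fold cross-validation), the overall classification accuracy and accuracy index based on amino acid composition distribution reached 86.22% and 67.12%. These values are 5.74 and 10.03 percentage points higher than those of the traditional feature extraction method based on amino acid composition, and 3.12 and 5.63 percentage points higher than those of the dipeptide composition method, indicating that amino acid composition distribution is a very effective feature extraction method for protein homo-oligomer classification. Combining amino acid composition distribution with protein sequence length raised the overall accuracy and accuracy index to 86.35% and 67.23%, suggesting that sequence length carries some spatial structure information.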
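The two baseline feature sets named in the abstract above, amino acid composition and dipeptide composition, can be sketched as follows. This is a minimal illustration, not the paper's code: the function names are hypothetical, and the paper's own "amino acid composition distribution" feature is not reproduced because the abstract does not define it.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def _clean(sequence):
    """Keep only standard residues; anything else (gaps, X, whitespace) is dropped."""
    cleaned = "".join(r for r in sequence.upper() if r in AMINO_ACIDS)
    if len(cleaned) < 2:
        raise ValueError("need at least two standard residues")
    return cleaned


def amino_acid_composition(sequence):
    """20-dimensional vector: fraction of each residue type in the sequence."""
    seq = _clean(sequence)
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def dipeptide_composition(sequence):
    """400-dimensional vector: fraction of each ordered pair of adjacent residues."""
    seq = _clean(sequence)
    counts = {}
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        counts[pair] = counts.get(pair, 0) + 1
    total = len(seq) - 1
    return [counts.get(a + b, 0) / total for a, b in product(AMINO_ACIDS, repeat=2)]


aac = amino_acid_composition("ACCA")  # hypothetical toy sequence
dpc = dipeptide_composition("ACCA")
```

With four classes, a one-versus-one strategy trains one binary classifier per pair of classes, that is 4 × 3 / 2 = 6 classifiers, and combines their votes.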
9.
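Feature extraction and classification are key problems in pattern recognition. Combining wavelet analysis theory with support vector machine theory, a classifier model was constructed to divide prostate cancer gene chip data into two classes, cancer and normal. Low-frequency wavelet coefficients were extracted to represent the original data and fed to a support vector machine classifier. Experiments show that with the low-frequency coefficients from a 4-level db1 wavelet decomposition, the correct classification rate reached 93.53%; with the Haar wavelet it was 92.94%. Extracting the low-frequency coefficients of different wavelets therefore yields similar classification performance.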
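Extracting low-frequency (approximation) coefficients from a multi-level decomposition, as described in the abstract above, can be sketched with the Haar wavelet, whose low-pass step is a scaled pairwise sum. This is a standard-library sketch under assumptions: the function name is hypothetical, odd-length inputs are padded by repeating the last value (one of several possible boundary conventions), and the toy signal is not expression data.

```python
import math


def haar_approximation(signal, levels):
    """Return the Haar approximation coefficients after `levels` decompositions."""
    approx = [float(v) for v in signal]
    for _ in range(levels):
        if len(approx) < 2:
            raise ValueError("signal too short for the requested number of levels")
        if len(approx) % 2:
            approx.append(approx[-1])  # assumption: pad odd lengths by repetition
        # Haar low-pass filter: (a + b) / sqrt(2) on non-overlapping pairs.
        approx = [(approx[i] + approx[i + 1]) / math.sqrt(2)
                  for i in range(0, len(approx), 2)]
    return approx


# A 32-point toy signal shrinks to 2 coefficients after 4 levels,
# approximately [30.0, 94.0] (each is the sum of 16 values divided by 4).
features = haar_approximation(range(32), 4)
```

Each level halves the number of coefficients, so a 4-level decomposition reduces the feature dimension by a factor of 16 before the classifier sees the data.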
10.
11.
Analysis of protein glycosylation by mass spectrometry (total citations: 1; self-citations: 0; citations by others: 1)
Bo Nilsson 《Molecular biotechnology》1994,2(3):243-280
There is a growing pharmaceutical market for protein-based drugs for use in therapy and diagnosis. The rapid developments
in molecular and cell biology have resulted in production of expression systems for manufacturing of recombinant proteins
and monoclonal antibodies. These proteins are glycosylated when expressed in cell systems with glycosylation ability. For
glycoproteins intended for therapeutic administration it is important to have knowledge about the structure of the carbohydrate
side chains to avoid cell systems that produce structures which can cause undesired effects in humans, e.g., immunological
reactions and an unfavorable serum clearance rate. Structural analysis of glycoprotein oligosaccharides requires sophisticated instruments
like mass spectrometers and nuclear magnetic resonance spectrometers. However, before the structural analysis can be conducted,
the carbohydrate chains have to be released from the protein and purified to homogeneity, and this is often the most time-consuming
step. Mass spectrometry has played and still plays an important role in analysis of protein glycosylation. The superior sensitivity
compared to other spectroscopic methods is its main asset. Structural analysis of carbohydrates faces several problems, however,
due to the chemical nature of the constituent monosaccharide residues. For oligosaccharides or glycoconjugates, the structural
information from mass spectrometry is essentially limited to monosaccharide sequence and molecular weight; only in exceptional
cases can glycosidic linkage positions be obtained. In order to completely establish an oligosaccharide structure, several
other structural parameters have to be determined, e.g., linkage positions, anomeric configuration and identification of the
monosaccharide building blocks. One way to address some of these problems is to work on chemical pretreatment of the glycoconjugate,
to specifically modify the carbohydrate chain. In order to introduce specific modifications, we have used periodate oxidation
and trifluoroacetolysis with the objective of determining glycosidic linkage positions by mass spectrometry.
12.
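Protein mass spectrometry is an important research tool in proteomics and has been applied successfully in areas such as early cancer diagnosis, but the curse of dimensionality in protein mass spectrometry data makes dimension reduction a necessary step in spectral analysis. This work first preprocesses the high-resolution SELDI-TOF ovarian mass spectrometry data provided by the US National Cancer Institute. The feature selection problem is then cast as a combinatorial optimization model solved by simulated annealing: the objective function is built from the classification error rate of linear discriminant analysis and the sample posterior probabilities, the new-solution generator is built from a uniform distribution and a control parameter, and a memory function is added to the annealing process. Training and test samples are then selected by 10-fold cross-validation, and a linear discriminant analysis classifier is used to evaluate the dimension-reduced data. Experiments show that when simulated annealing selects six or more features, all of the high-resolution SELDI-TOF ovarian mass spectrometry data are classified correctly, indicating that simulated annealing is well suited to feature selection for protein mass spectrometry data.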
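Feature selection by simulated annealing with a memory of the best solution, as outlined in the abstract above, can be sketched as follows. Everything here is illustrative: the function and parameter names are hypothetical, the cooling schedule is an assumed geometric one, and the cost function is a toy stand-in for the paper's objective (which combines the linear discriminant analysis error rate with sample posterior probabilities).

```python
import math
import random


def anneal_feature_subset(n_features, subset_size, cost, steps=2000,
                          start_temp=1.0, cooling=0.995, seed=0):
    """Search for a low-cost feature subset of fixed size (subset_size < n_features)."""
    rng = random.Random(seed)
    current = set(rng.sample(range(n_features), subset_size))
    current_cost = cost(current)
    best, best_cost = set(current), current_cost  # memory: best subset seen so far
    temp = start_temp
    for _ in range(steps):
        # New-solution generator: swap one selected feature for one unselected
        # feature, both drawn uniformly at random.
        removed = rng.choice(sorted(current))
        added = rng.choice([f for f in range(n_features) if f not in current])
        candidate = (current - {removed}) | {added}
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = set(current), current_cost
        temp *= cooling  # assumed geometric cooling schedule
    return best, best_cost


# Toy objective (hypothetical): count selected features outside a known informative set.
INFORMATIVE = {2, 5, 7}


def toy_cost(subset):
    return len(subset - INFORMATIVE)


best, best_cost = anneal_feature_subset(20, 3, toy_cost)
```

The memory step matters because the annealing walk may leave a good subset after accepting a worse move; returning the stored best rather than the final state keeps that result.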
13.
The identification of new diagnostic or prognostic biomarkers is one of the main aims of clinical cancer research. In recent years, there has been a growing interest in using mass spectrometry for the detection of such biomarkers. The MS signal resulting from MALDI‐TOF measurements is contaminated by different sources of technical variations that can be removed by a prior pre‐processing step. In particular, denoising makes it possible to remove the random noise contained in the signal. Wavelet methodology associated with thresholding is usually used for this purpose. In this study, we adapted two multivariate denoising methods that combine wavelets and PCA to MS data. The objective was to obtain better denoising of the data so as to extract the meaningful proteomic biological information from the raw spectra and reach meaningful clinical conclusions. The proposed methods were evaluated and compared with the classical soft thresholding denoising method using both real and simulated data sets. It was shown that taking into account common structures of the signals by adding a dimension reduction step on approximation coefficients through PCA provided more effective denoising when combined with soft thresholding on detail coefficients.
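The classical baseline named in the abstract above, soft thresholding of wavelet detail coefficients, can be sketched for a single spectrum as follows. This is a one-level Haar illustration with hypothetical function names. It does not implement the multivariate wavelet and PCA methods the study proposes, and the threshold is left as a free parameter rather than estimated from the noise.

```python
import math


def soft_threshold(value, threshold):
    """Shrink a coefficient toward zero by `threshold`; small coefficients become 0."""
    return math.copysign(max(abs(value) - threshold, 0.0), value)


def haar_denoise(signal, threshold):
    """One-level Haar transform, soft-threshold the details, then invert."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    root2 = math.sqrt(2)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / root2 for a, b in pairs]   # low-frequency part, kept as is
    detail = [(a - b) / root2 for a, b in pairs]   # high-frequency part, shrunk
    detail = [soft_threshold(d, threshold) for d in detail]
    denoised = []
    for a, d in zip(approx, detail):
        denoised.extend([(a + d) / root2, (a - d) / root2])
    return denoised


toy_signal = [1.0, 3.0, 2.0, 2.0]             # hypothetical intensities
unchanged = haar_denoise(toy_signal, 0.0)     # threshold 0 reproduces the input
smoothed = haar_denoise(toy_signal, 10.0)     # large threshold averages each pair
```

Leaving the approximation coefficients untouched and shrinking only the details is the split the abstract refers to; the proposed methods additionally apply PCA to the approximation coefficients across spectra.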
14.
Kadlík V, Strohalm M, Kodícek M 《Biochemical and biophysical research communications》2003,305(4):1091-1093
The lysine ε-amino group reacts with citraconic anhydride to form a derivative that is stable under the conditions used for trypsin cleavage. This modification changes the set of peptides produced by trypsin: because the number of trypsin-sensitive sites is reduced, peptides of higher molecular mass survive in the digest. Studies of proteins by MALDI-TOF mass spectrometry are often complicated by low sequence coverage of the peptide chain. This paper demonstrates that citraconylation of proteins before trypsin cleavage is a simple experimental technique that significantly increases sequence coverage in MALDI-TOF mass spectrometry. The improvement results both from the change in the trypsin fragmentation pattern and from disruption of the protein's native tertiary structure.
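The effect of blocking lysine on the digest can be illustrated with a simplified in-silico cleavage. Trypsin cleaves on the C-terminal side of lysine (K) and arginine (R); if modified lysines are no longer recognized, only arginine sites remain. The sketch below is an assumption-laden toy: the function name and sequence are hypothetical, and it ignores missed cleavages and the commonly applied exception for a following proline.

```python
def digest(sequence, cleave_after="KR"):
    """Split a sequence after every residue listed in `cleave_after`."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in cleave_after:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder, if any
    return peptides


protein = "MKTAYRGHKLLSR"                      # hypothetical toy sequence
native = digest(protein)                       # ['MK', 'TAYR', 'GHK', 'LLSR']
citraconylated = digest(protein, "R")          # ['MKTAYR', 'GHKLLSR']
```

Fewer cleavage sites give fewer but longer peptides, which is the shift toward higher molecular mass that the abstract describes.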
15.
In this work, the commonly used algorithms for mass spectrometry based protein identification, Mascot, MS-Fit, ProFound and SEQUEST, were studied with respect to the selectivity and sensitivity of their searches. The influence of various search parameters was also investigated. Approximately 6600 searches were performed using different search engines with several search parameters to establish a statistical basis. The mass spectrometric data set was chosen from a current proteome study. The huge amount of data could only be handled with computational assistance. We present a software solution for fully automated triggering of several peptide mass fingerprinting (PMF) and peptide fragmentation fingerprinting (PFF) algorithms. The development of this high-throughput method made possible an intensive evaluation based on data acquired in a typical proteome project. Previous evaluations of PMF and PFF algorithms were mainly based on simulations.
16.
Rita Grandori, Carlo Santambrogio, Stefania Brocca, Gaetano Invernizzi, Marina Lotti 《Biotechnology journal》2009,4(1):73-87
Since the early 1990s, electrospray-ionization mass spectrometry (ESI-MS) has attracted growing interest as a complementary tool to established biochemical and biophysical methods for investigating protein structure and conformation. Applications of ESI-MS to protein investigation now range from analytical biochemistry to structural biology. This review focuses on applications of the technique to the analysis of protein conformational properties and molecular interactions, underscoring their possible relevance for molecular biotechnology, although the field is still very young. An introductory section presents the major issues related to theoretical and technical aspects of ESI-MS under non-denaturing conditions. Examples from our work and from the literature illustrate what kind of information can be obtained on key issues in biotechnology, such as the stability and aggregation of proteins under both near-native and challenging conditions, and interactions with other proteins, ligands and cofactors.
17.
The discovery of functional protein complexes and the interrogation of their structure-function relationship (SFR) play crucial roles in understanding and intervening in biological processes. Affinity purification-mass spectrometry (AP-MS) has proven to be a powerful tool for the discovery of protein complexes. However, validating these novel protein complexes and elucidating their molecular interaction mechanisms remain challenging. Recently, native top-down MS (nTDMS) has developed rapidly for the structural analysis of protein complexes. In this review, we discuss the integration of AP-MS and nTDMS in the discovery and structural characterization of functional protein complexes. Further, we think that emerging artificial intelligence (AI)-based protein structure prediction is highly complementary to nTDMS and that the two can advance each other. We expect the hybridization of integrated structural MS with AI prediction to be a powerful workflow for the discovery and SFR investigation of functional protein complexes.
18.
As a useful alternative to Cox's proportional hazard model, the additive risk model assumes that the hazard function is the sum of the baseline hazard function and the regression function of covariates. This article is concerned with estimation and prediction for the additive risk models with right censored survival data, especially when the dimension of the covariates is comparable to or larger than the sample size. Principal component regression is proposed to give unique and numerically stable estimators. Asymptotic properties of the proposed estimators, component selection based on the weighted bootstrap, and model evaluation techniques are discussed. This approach is illustrated with analysis of the primary biliary cirrhosis clinical data and the diffuse large B-cell lymphoma genomic data. It is shown that this methodology is numerically stable and effective in dimension reduction, while still being able to provide satisfactory prediction and classification results.
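For comparison, the two hazard models contrasted in the abstract above can be written side by side. The notation is an assumption, since the abstract gives no formulas: $\lambda_0(t)$ is the baseline hazard, $Z$ the covariate vector, $\beta$ the regression coefficients, and the regression function is taken to be linear, which is the usual form of the additive risk model.

$$
\text{Cox proportional hazards:}\quad \lambda(t \mid Z) = \lambda_0(t)\,\exp\!\left(\beta^{\top} Z\right)
$$

$$
\text{Additive risk:}\quad \lambda(t \mid Z) = \lambda_0(t) + \beta^{\top} Z
$$

Under principal component regression, $Z$ would be replaced by a small number of its leading principal components, which keeps the estimator unique when the number of covariates is comparable to or larger than the sample size.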
19.
Imaging mass spectrometry (IMS) is a two-dimensional mass spectrometry technique for visualizing the spatial distribution of biomolecules. It requires neither separation nor purification of target molecules, and it makes it possible both to identify unknown molecules and to localize numerous molecules simultaneously. Among ionization techniques, matrix-assisted laser desorption/ionization (MALDI) is one of the most widely used for IMS and allows the analysis of numerous biomolecules over a wide range of molecular weights. Proper selection and preparation of the matrix is essential for successful imaging. Tandem mass spectrometry, referred to as MSn, enables structural analysis of a molecule detected in the first step of IMS. Applications of IMS were initially developed for studying proteins and peptides. At present, however, the targets of IMS research have expanded to the imaging of small endogenous metabolites such as lipids, the pharmacokinetics of exogenous drugs, the search for new disease markers, and other new scientific fields. We hope that this new technology will open a new era for biophysics.