1.

A Bayesian network approach to feature selection in mass spectrometry data

Karl W Kuschner Dariya I Malyarenko William E Cooke Lisa H Cazares OJ Semmes Eugene R Tracy 《BMC bioinformatics》2010,11(1):177

Background

Time-of-flight mass spectrometry (TOF-MS) has the potential to provide non-invasive, high-throughput screening for cancers and other serious diseases via detection of protein biomarkers in blood or other accessible biologic samples. Unfortunately, this potential has largely been unrealized to date due to the high variability of measurements, uncertainties in the distribution of proteins in a given population, and the difficulty of extracting repeatable diagnostic markers using current statistical tools. With studies consisting of perhaps only dozens of samples, and possibly hundreds of variables, overfitting is a serious complication. To overcome these difficulties, we have developed a Bayesian inductive method which uses model-independent methods of discovering relationships between spectral features. This method appears to efficiently discover network models that not only identify connections between the disease and key features but also organize relationships between features; furthermore, it creates a stable classifier that categorizes new data at predicted error rates.

2.

Evaluation of algorithms for protein identification from sequence databases using mass spectrometry data

Chamrad DC Körting G Stühler K Meyer HE Klose J Blüggel M 《Proteomics》2004,4(3):619-628

In this work, the commonly used algorithms for mass spectrometry based protein identification, Mascot, MS-Fit, ProFound and SEQUEST, were studied with respect to the selectivity and sensitivity of their searches. The influence of various search parameters was also investigated. Approximately 6600 searches were performed using different search engines with several search parameters to establish a statistical basis. The applied mass spectrometric data set was chosen from a current proteome study. The huge amount of data could only be handled with computational assistance. We present a software solution for fully automated triggering of several peptide mass fingerprinting (PMF) and peptide fragmentation fingerprinting (PFF) algorithms. The development of this high-throughput method made an intensive evaluation based on data acquired in a typical proteome project possible. Previous evaluations of PMF and PFF algorithms were mainly based on simulations.

3.

Identification of biomarkers from mass spectrometry data using a "common" peak approach

Tadayoshi Fushiki Hironori Fujisawa Shinto Eguchi 《BMC bioinformatics》2006,7(1):358-9

Background

Proteomic data obtained from mass spectrometry have attracted great interest for the detection of early-stage cancer. However, as mass spectrometry data are high-dimensional, identification of biomarkers is a key problem.

4.

Comparison of algorithms for pre-processing of SELDI-TOF mass spectrometry data

Cruz-Marcelo A Guerra R Vannucci M Li Y Lau CC Man TK 《Bioinformatics (Oxford, England)》2008,24(19):2129-2136

MOTIVATION: Surface-enhanced laser desorption and ionization (SELDI) time of flight (TOF) is a mass spectrometry technology. The key features in a mass spectrum are its peaks. In order to locate the peaks and quantify their intensities, several pre-processing steps are required. Though different approaches to perform pre-processing have been proposed, there is no systematic study that compares their performance. RESULTS: In this article, we present the results of a systematic comparison of various popular packages for pre-processing of SELDI-TOF data. We evaluate their performance in terms of two of their primary functions: peak detection and peak quantification. Regarding peak quantification, the performance of the algorithms is measured in terms of reproducibility. For peak detection, the comparison is based on sensitivity and false discovery rate. Our results show that for spectra generated with low laser intensity, the software developed by Ciphergen Biosystems (ProteinChip Software 3.1 with the additional tool Biomarker Wizard) produces relatively good results for both peak quantification and detection. On the other hand, for the data produced with either medium or high laser intensity, none of the methods show uniformly better performances under both criteria. Our analysis suggests that an advantageous combination is the use of the packages MassSpecWavelet and PROcess, the former for peak detection and the latter for peak quantification.
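
The two peak-detection criteria named in this abstract, sensitivity and false discovery rate, can be illustrated with a small sketch. It is not the authors' evaluation code: the matching rule (nearest unused detected peak within a fixed m/z tolerance), the function names and the example values are all assumptions made for illustration.

```python
def match_peaks(detected, true_peaks, tol):
    """Count true peaks that have an unused detected peak within tol (greedy, nearest first)."""
    unused = sorted(detected)
    matched = 0
    for t in sorted(true_peaks):
        candidates = [d for d in unused if abs(d - t) <= tol]
        if candidates:
            best = min(candidates, key=lambda d: abs(d - t))
            unused.remove(best)  # each detected peak may explain only one true peak
            matched += 1
    return matched


def sensitivity_and_fdr(detected, true_peaks, tol=1.0):
    """Sensitivity = matched / true peaks; FDR = unmatched detections / all detections."""
    tp = match_peaks(detected, true_peaks, tol)
    sensitivity = tp / len(true_peaks) if true_peaks else 0.0
    fdr = (len(detected) - tp) / len(detected) if detected else 0.0
    return sensitivity, fdr


# Hypothetical m/z positions, invented for this example.
true_mz = [1000.0, 1500.0, 2000.0, 2500.0]
found_mz = [1000.4, 1499.7, 1800.0, 2500.2, 3000.0]
sens, fdr = sensitivity_and_fdr(found_mz, true_mz, tol=1.0)
print(f"sensitivity={sens:.2f}, FDR={fdr:.2f}")
```

Here three of the four true peaks are recovered and two of the five detections are spurious, so a detector can look good on one criterion and poor on the other, which is why the comparison reports both.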

5.

Feature extraction from high-dimensional protein mass spectrometry cancer data

Wu Wenfeng, Liu Yihui 《生物信息学》(Chinese Journal of Bioinformatics) 2015,13(2):131-140

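Analysis of high-dimensional protein mass spectrometry cancer data has long been hampered by the dimensionality of the data. To address the problems that arise when reducing the dimensionality of such data, we propose a feature extraction method based on wavelet analysis and principal component analysis; after feature extraction, a support vector machine is used for classification. When a 2-level wavelet decomposition is applied to the 8-7-02 data set with the db1, db3, db4, db6, db8, db10 and haar wavelet bases, followed by support vector machine classification, the accuracy reaches 98.18%, 98.35%, 98.04%, 98.36%, 97.89%, 97.96% and 98.20%, respectively. The method further improves classification accuracy while also improving time efficiency.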
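
As a rough illustration of the wavelet step only, the sketch below performs a 2-level decomposition with the orthonormal Haar basis, the simplest of the bases listed. It uses plain Python rather than a wavelet library, and it omits the principal component analysis and support vector machine stages. The function names and the eight-point "spectrum" are assumptions for the example, not values from the paper.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: returns (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels=2):
    """Apply haar_step repeatedly to the approximation coefficients."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details


# Hypothetical intensities; a real spectrum would have thousands of points.
spectrum = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
features, details = haar_decompose(spectrum, levels=2)
print(len(spectrum), "points reduced to", len(features), "approximation coefficients")
```

Each level halves the number of approximation coefficients, so two levels shrink the input by a factor of four before any further reduction; in a pipeline like the one described, these coefficients would then be passed on to the later stages.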

6.

Clustering mass spectrometry data using order statistics

Slotta DJ Heath LS Ramakrishnan N Helm R Potts M 《Proteomics》2003,3(9):1687-1691

Mass spectrometry data is inherently uncertain. Rather than compare peak heights across samples, a comparison can be made of the relative ordering of the peak heights across samples. Order statistics are used to provide a distance metric between each ordered list of peak heights from the samples. A principal component analysis is performed on the set of distance vectors to highlight the important components.
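
The idea of comparing orderings rather than raw heights can be sketched as follows. The abstract does not state which metric the authors use, so the Spearman footrule (sum of absolute rank differences) stands in here as an assumed example of a rank-based distance; the function names and sample values are likewise hypothetical.

```python
def ranks(values):
    """Rank of each value (0 = smallest); ties are broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def footrule_distance(a, b):
    """Spearman footrule: sum of absolute differences between the two rank vectors."""
    return sum(abs(x - y) for x, y in zip(ranks(a), ranks(b)))


# Hypothetical peak heights at the same four m/z positions in three samples.
sample_a = [10.0, 50.0, 30.0, 5.0]
sample_b = [100.0, 480.0, 310.0, 60.0]  # roughly 10x the intensity of sample_a, same ordering
sample_c = [40.0, 5.0, 30.0, 50.0]      # different ordering

print(footrule_distance(sample_a, sample_b))
print(footrule_distance(sample_a, sample_c))
```

Because only the ordering enters the distance, the overall intensity difference between `sample_a` and `sample_b` has no effect, which is the robustness the abstract is aiming for.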

7.

Calibration using constrained smoothing with applications to mass spectrometry data

Xingdong Feng Nell Sedransk Jessie Q. Xia 《Biometrics》2014,70(2):398-408

8.

swissPIT: a novel approach for pipelined analysis of mass spectrometry data

Quandt A Hernandez P Masselot A Hernandez C Maffioletti S Pautasso C Appel RD Lisacek F 《Bioinformatics (Oxford, England)》2008,24(11):1416-1417

The identification and characterization of peptides from tandem mass spectrometry (MS/MS) data represents a critical aspect of proteomics. Today, tandem MS analysis is often performed using only a single identification program, achieving identification rates of 10-50% (Elias and Gygi, 2007). Besides the development of new analysis tools, recent publications also describe the pipelining of different search programs to increase the identification rate (Hartler et al., 2007; Keller et al., 2005). The Swiss Protein Identification Toolbox (swissPIT) follows this approach, but goes a step further by providing the user with an expandable multi-tool platform capable of executing workflows to analyze tandem MS-based data. One of the major problems in proteomics is the absence of standardized workflows to analyze the produced data. This includes the pre-processing part as well as the final identification of peptides and proteins. The main idea of swissPIT is not only the use of different identification tools in parallel, but also the meaningful concatenation of different identification strategies at the same time. swissPIT is open source software, but we also provide a user-friendly web platform, which demonstrates the capabilities of our software and is available at http://swisspit.cscs.ch upon request for an account.

9.

Complexities and algorithms for glycan sequencing using tandem mass spectrometry

Shan B Ma B Zhang K Lajoie G 《Journal of bioinformatics and computational biology》2008,6(1):77-91

Determining glycan structures is vital to comprehend cell-matrix, cell-cell, and even intracellular biological events. Glycan sequencing, which determines the primary structure of a glycan using tandem mass spectrometry (MS/MS), remains one of the most important tasks in proteomics. Analogous to peptide de novo sequencing, glycan de novo sequencing determines the structure without the aid of a known glycan database. We show in this paper that glycan de novo sequencing is NP-hard. We then provide a heuristic algorithm and develop a software program to solve the problem in practical cases. Experiments on real MS/MS data of glycopeptides demonstrate that our heuristic algorithm gives satisfactory results on practical data.

10.

Comparison of public peak detection algorithms for MALDI mass spectrometry data analysis

Chao Yang Zengyou He Weichuan Yu 《BMC bioinformatics》2009,10(1):4

Background

In mass spectrometry (MS) based proteomic data analysis, peak detection is an essential step for subsequent analysis. Recently, there has been significant progress in the development of various peak detection algorithms. However, neither a comprehensive survey nor an experimental comparison of these algorithms is yet available. The main objective of this paper is to provide such a survey and to compare the performance of single spectrum based peak detection methods.

11.

Nonparametric estimation of natural selection on a quantitative trait using mark-recapture data

Gimenez O Covas R Brown CR Anderson MD Brown MB Lenormand T 《Evolution; international journal of organic evolution》2006,60(3):460-466

Assessing natural selection on a phenotypic trait in wild populations is of primary importance for evolutionary ecologists. To cope with the imperfect detection of individuals inherent to monitoring in the wild, we develop a nonparametric method for evaluating the form of natural selection on a quantitative trait using mark-recapture data. Our approach uses penalized splines to achieve flexibility in exploring the form of natural selection by avoiding the need to specify an a priori parametric function. If needed, it can help in suggesting a new parametric model. We employ Markov chain Monte Carlo sampling in a Bayesian framework to estimate model parameters. We illustrate our approach using data for a wild population of sociable weavers (Philetairus socius) to investigate survival in relation to body mass. In agreement with previous parametric analyses, we found that lighter individuals showed a reduction in survival. However, the survival function was not symmetric, indicating that body mass might not be under stabilizing selection as suggested previously.

12.

Preprocessing, classification modeling and feature selection using flow injection electrospray mass spectrometry metabolite fingerprint data

Enot DP Lin W Beckmann M Parker D Overy DP Draper J 《Nature protocols》2008,3(3):446-470

Metabolome analysis by flow injection electrospray mass spectrometry (FIE-MS) fingerprinting generates measurements relating to large numbers of m/z signals. Such data sets often exhibit high variance with a paucity of replicates, thus providing a challenge for data mining. We describe data preprocessing and modeling methods that have proved reliable in projects involving samples from a range of organisms. The protocols interact with software resources specifically for metabolomics provided in a Web-accessible data analysis package FIEmspro (http://users.aber.ac.uk/jhd) written in the R environment and requiring a moderate knowledge of R command-line usage. Specific emphasis is placed on describing the outcome of modeling experiments using FIE-MS data that require further preprocessing to improve quality. The salient features of both poor and robust (i.e., highly generalizable) multivariate models are outlined together with advice on validating classifiers and avoiding false discovery when seeking explanatory variables.

13.

Modeling of protein binary complexes using structural mass spectrometry data

Kamal JK Chance MR 《Protein science : a publication of the Protein Society》2008,17(1):79-94

In this article, we describe a general approach to modeling the structure of binary protein complexes using structural mass spectrometry data combined with molecular docking. In the first step, hydroxyl radical mediated oxidative protein footprinting is used to identify residues that experience conformational reorganization due to binding or participate in the binding interface. In the second step, a three-dimensional atomic structure of the complex is derived by computational modeling. Homology modeling approaches are used to define the structures of the individual proteins if footprinting detects significant conformational reorganization as a function of complex formation. A three-dimensional model of the complex is constructed from these binary partners using the ClusPro program, which is composed of docking, energy filtering, and clustering steps. Footprinting data are used to incorporate constraints (positive and/or negative) in the docking step and are also used to decide the type of energy filter (electrostatics or desolvation) in the successive energy-filtering step. By using this approach, we examine the structure of a number of binary complexes of monomeric actin and compare the results to crystallographic data. Based on docking alone, a number of competing models with widely varying structures are observed, one of which is likely to agree with crystallographic data. When the docking steps are guided by footprinting data, accurate models emerge as top scoring. We demonstrate this method with the actin/gelsolin segment-1 complex. Using this approach, we also provide a structural model for the actin/cofilin complex, which does not have a crystal or NMR structure.

14.

A machine learning approach to explore the spectra intensity pattern of peptides using tandem mass spectrometry data

Cong Zhou Lucas D Bowler Jianfeng Feng 《BMC bioinformatics》2008,9(1):325

Background

A better understanding of the mechanisms involved in gas-phase fragmentation of peptides is essential for the development of more reliable algorithms for high-throughput protein identification using mass spectrometry (MS). Current methodologies depend predominantly on the use of derived m/z values of fragment ions, and the knowledge provided by the intensity information present in MS/MS spectra has not been fully exploited. Indeed, spectrum intensity information is very rarely utilized in the algorithms currently in use for high-throughput protein identification.

15.

The estimation of glutarate in urine using gas chromatography mass spectrometry with an internal isotopic standard

C R Lee R J Pollitt 《Biochemical medicine》1972,6(6):536-542

16.

Quantitative approach to single-nucleotide polymorphism analysis using MALDI-TOF mass spectrometry

Ross P Hall L Haff LA 《BioTechniques》2000,29(3):620-6, 628-9

Pooling of DNA samples before genotyping is a valuable means of streamlining large-scale genotyping efforts in disease association studies, single-nucleotide polymorphism (SNP) validation or mutant allele screening programs. In this report, we explore the application of matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) to quantitative analysis of SNPs. The measurements are based on MALDI-TOF MS analysis of primer extension assays performed on standard mixtures of pooled PCR products at several test loci. The inherent high molecular weight resolution of MALDI-TOF MS conveys high specificity and good signal-to-noise ratio for performing accurate quantitation. The methods described maximize the sensitivity and quantitative capacity of MALDI-TOF MS while preserving the throughput and economic advantages of the MALDI-TOF platform. Using the format described, we demonstrate that allele frequencies as low as 5% can be detected quantitatively and unambiguously.
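
The arithmetic behind estimating an allele frequency from a pooled sample can be sketched as a ratio of peak areas. The abstract does not give the authors' formula, so this is a simplified assumption: it treats the two allele-specific extension products as ionizing and extending with equal efficiency, which a real assay would typically check against standard mixtures. The function names and the peak areas are invented for illustration.

```python
def allele_frequency(area_a, area_b):
    """Fraction of allele A, assuming peak area is proportional to allele abundance."""
    total = area_a + area_b
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return area_a / total


def mean_frequency(replicates):
    """Average the per-spectrum estimates over replicate measurements."""
    freqs = [allele_frequency(a, b) for a, b in replicates]
    return sum(freqs) / len(freqs)


# Hypothetical (minor allele, major allele) peak areas from three replicate spectra.
replicate_areas = [(52.0, 948.0), (49.0, 951.0), (50.0, 950.0)]
estimate = mean_frequency(replicate_areas)
print(f"estimated minor allele frequency: {estimate:.3f}")
```

Averaging over replicates is one simple way to reduce spot-to-spot variation; it is a design choice of this sketch rather than something stated in the abstract.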

17.

Anatomical distribution of lipid species in rodent brain using imaging mass spectrometry

E. González de San Román A. Veloso G. Barreda-Gómez E. Astigarraga M.T. Giralt B. Ochoa O. Fresnedo J.A. Fernández-González R. Rodríguez-Puertas 《Chemistry and physics of lipids》2010

18.

Feature selection for high-resolution protein mass spectrometry data based on simulated annealing

Li Yifeng, Liu Yihui 《生物信息学》(Chinese Journal of Bioinformatics) 2009,7(2):85-90

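Protein mass spectrometry is an important research tool in proteomics and has been applied successfully to areas such as early cancer diagnosis, but the curse of dimensionality inherent in protein mass spectrometry data makes dimensionality reduction a necessary step in the analysis. We first preprocess the high-resolution SELDI-TOF ovarian mass spectrometry data provided by the US National Cancer Institute. We then cast feature selection as a combinatorial optimization model solved by simulated annealing: the objective function is built from the classification error rate and sample posterior probabilities of linear discriminant analysis, the new-solution generator is built from a uniform distribution and a control parameter, and a memory function is added to the annealing process. Training and test samples are chosen by 10-fold cross-validation, and a linear discriminant analysis classifier is used to evaluate the dimension-reduced data. Experiments show that when simulated annealing selects six or more features, all of the high-resolution SELDI-TOF ovarian spectra are classified correctly, indicating that simulated annealing is well suited to feature selection for protein mass spectrometry data.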
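
A minimal sketch of simulated annealing over feature subsets is given below, including a memory of the best subset seen so far. It is not the authors' implementation: the objective in the paper is built from a linear discriminant analysis error rate and posterior probabilities, whereas `toy_cost`, the set `INFORMATIVE`, the geometric cooling schedule and all parameter values here are hypothetical stand-ins chosen so the example runs without external data.

```python
import math
import random

INFORMATIVE = {2, 5, 7}  # hypothetical "truly useful" feature indices


def toy_cost(mask):
    """Stand-in objective: penalize missed informative features and, mildly, extra ones."""
    chosen = {i for i, on in enumerate(mask) if on}
    return len(INFORMATIVE - chosen) + 0.1 * len(chosen - INFORMATIVE)


def anneal_features(n_features, cost, steps=2000, t0=1.0, alpha=0.995, seed=0):
    """Simulated annealing over boolean feature masks, remembering the best mask seen."""
    rng = random.Random(seed)
    current = [rng.random() < 0.5 for _ in range(n_features)]
    current_cost = cost(current)
    best, best_cost = current[:], current_cost  # memory of the best solution
    temp = t0
    for _ in range(steps):
        candidate = current[:]
        flip = rng.randrange(n_features)  # uniformly chosen feature to toggle
        candidate[flip] = not candidate[flip]
        delta = cost(candidate) - current_cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, current_cost + delta
            if current_cost < best_cost:
                best, best_cost = current[:], current_cost
        temp *= alpha  # geometric cooling
    return best, best_cost


best_mask, best_cost = anneal_features(10, toy_cost)
selected = [i for i, on in enumerate(best_mask) if on]
print("selected features:", selected, "cost:", round(best_cost, 3))
```

In a setting like the one described, the cost function would be replaced by a cross-validated classifier error on the selected m/z features; keeping the best-so-far mask separately means a late uphill move cannot lose a good subset.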

19.

Quantitation of multisite EGF receptor phosphorylation using mass spectrometry and a novel normalization approach

Boeri Erba E Matthiesen R Bunkenborg J Schulze WX Di Stefano P Cabodi S Tarone G Defilippi P Jensen ON 《Journal of proteome research》2007,6(7):2768-2785

Using stable isotope labeling and mass spectrometry, we performed a sensitive, quantitative analysis of multiple phosphorylation sites of the epidermal growth factor (EGF) receptor. Phosphopeptide detection efficiency was significantly improved by using the tyrosine phosphatase inhibitor sodium pervanadate to boost the abundance of phosphorylation of the EGF receptor. Nine phosphorylation sites (pT669, pS967, pS1002, pY845, pY974, pY1045, pY1086, pY1148, and pY1173) of EGF receptor were quantified from EGF-stimulated cells in suspension and adherent conditions. Our data sets revealed that EGF stimulation of adherent cells induced higher levels of tyrosine phosphorylation relative to EGF stimulation of suspended cells. In contrast, EGF stimulation of adherent cells induced lower levels of serine and threonine phosphorylation relative to EGF stimulation of suspended cells. These findings are consistent with the hypothesis that cellular adhesion modulates phosphorylation of plasma membrane receptor tyrosine kinases relevant for EGF-induced signal transduction processes. Furthermore, our results suggest that strong phosphatase inhibitors should be used to generate reference datasets in comparative phosphoproteomics experiments.

20.

Direct on-membrane glycoproteomic approach using MALDI-TOF mass spectrometry coupled with microdispensing of multiple enzymes

Kimura S Kameyama A Nakaya S Ito H Narimatsu H 《Journal of proteome research》2007,6(7):2488-2494

We report a novel approach for direct on-membrane glycoproteomics by digestion of membrane-blotted glycoproteins with multiple enzymes using piezoelectric chemical inkjet printing technology and on-membrane direct MALDI-TOF mass spectrometry. With this approach, both N-linked glycan analyses and peptide mass fingerprinting of several standard glycoproteins were successfully performed using PNGase F and trypsin microscale digestions of the blotted spots on membrane from an SDS-PAGE gel. In addition, we performed a similar analysis for 2-DE separated serum glycoproteins as a demonstration of how the system could be used in human plasma glycoproteomics.