Similar literature
1.
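Data-independent acquisition (DIA) is a high-throughput, unbiased mass spectrometry acquisition method that offers highly reproducible quantification and performs well for low-abundance proteins, making it one of the preferred methods for large-cohort proteomic studies in recent years. Because the MS/MS spectra produced by DIA are mixed spectra containing fragment ion information from multiple peptides, protein identification and quantification become more difficult. Current DIA data analysis methods fall into two categories: peptide-centric and spectrum-centric. Peptide-centric analysis offers more sensitive identification and more accurate quantification and has become the mainstream approach to DIA data interpretation. Its workflow comprises four key steps: building a spectral library, extracting chromatographic peak groups, feature scoring, and quality control of results. This article reviews the peptide-centric DIA data analysis workflow, introduces the analysis software based on it and the related comparative evaluations, summarizes existing algorithmic improvements, and concludes with an outlook on future directions.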

2.
Ongoing optimization of proteomic methodologies seeks to improve both the coverage and confidence of protein identifications. The optimization of sample preparation and the inclusion of technical replicates (repeated instrumental analysis of the same sample) and biological replicates (multiple individual samples) are crucial in proteomic studies to avoid the pitfalls associated with single-point analysis and under-sampling. Phosphopeptides were isolated from HeLa cells and analyzed by nano-reversed phase liquid chromatography electrospray ionization tandem mass spectrometry (nano-RP-LC-MS/MS). We observed that a detergent-based protein extraction approach, followed by additional steps for nucleic acid removal, provided a simple alternative to the broadly used Trizol extraction. The evaluation of four technical replicates demonstrated measurement reproducibility, with a low percent variance in peptide responses of approximately 3%; additional peptide identifications were made with each added technical replicate. The inclusion of six technical replicates for moderately complex protein extracts (approximately 4000 uniquely identified peptides per data set) affords the optimal collection of peptide information.
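The reproducibility figure in this abstract can be illustrated with a small calculation. The sketch below assumes that "percent variance" means the coefficient of variation (standard deviation divided by mean) of one peptide's response across technical replicates; the abstract does not define the term, and the replicate peak areas are made-up values.

```python
from statistics import mean, stdev


def percent_cv(responses):
    """Coefficient of variation, in percent, of one peptide's replicate responses."""
    if len(responses) < 2:
        raise ValueError("need at least two replicates")
    return stdev(responses) / mean(responses) * 100.0


# Hypothetical peak areas for one peptide measured in four technical replicates.
replicate_areas = [1000.0, 1030.0, 970.0, 1010.0]
print(f"CV = {percent_cv(replicate_areas):.1f}%")
```

The sample standard deviation is used here because the replicates are treated as a sample of possible measurements; a population standard deviation would give a slightly smaller value.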

3.
The random forest classification method was applied to classify samples from 76 breast cancer patients and 77 controls whose proteomic profile had been obtained using mass spectrometry. The analysis consisted of two stages, the detection of peaks from the profiles and the construction of a classification rule using random forests. Using a peak detection method based on finding common local maxima in the smoothed sample spectra, 444 peaks were detected, reducing to 365 robust peaks found in at least 7 out of 10 random subsets of samples. Subjects were classified as cases or controls using the random forest algorithm applied to the 365 peaks. Based on the prediction of the status of out-of-bag samples, the total error rate was 16.3%, with a sensitivity of 81.6% and a specificity of 85.7%. Measures of importance of each of the peaks were calculated to identify regions of the spectrum influencing the classification, and the four most important peaks were identified as mz3863_13, mz2943_12, mz3193_44 and mz8925_94. Combining initial peak detection with the random forest algorithm provides a high-performance classification system for proteomic data, with unbiased estimates of future performance.
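The reported error rate, sensitivity, and specificity can be checked against the cohort sizes. The sketch below assumes 62 correctly classified cases and 66 correctly classified controls; these counts are not stated in the abstract and were chosen only because they reproduce the reported percentages for 76 cases and 77 controls.

```python
def confusion_summary(n_cases, n_controls, true_pos, true_neg):
    """Return (sensitivity, specificity, total error rate) from classification counts."""
    sensitivity = true_pos / n_cases
    specificity = true_neg / n_controls
    errors = (n_cases - true_pos) + (n_controls - true_neg)
    error_rate = errors / (n_cases + n_controls)
    return sensitivity, specificity, error_rate


# 76 cases and 77 controls come from the abstract; 62 and 66 are assumed counts.
sens, spec, err = confusion_summary(76, 77, 62, 66)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} error={err:.1%}")
# prints: sensitivity=81.6% specificity=85.7% error=16.3%
```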

4.
Gel-free proteomics has emerged as a complement to conventional gel-based proteomics. Gel-free approaches focus on peptide or protein fractionation, but they do not address the efficiency of protein processing. We report the development of a microfluidic proteomic reactor that greatly simplifies the processing of complex proteomic samples by combining multiple proteomic steps. Rapid extraction and enrichment of proteins from complex proteomic samples or directly from cells are readily performed on the reactor. Furthermore, chemical and enzymatic treatments of proteins are performed in 50 nL effective volume, which results in an increased number of generated peptides. The products are compatible with mass spectrometry. We demonstrated that the proteomic reactor is at least 10 times more sensitive than current gel-free methodologies with one protein identified per 440 pg of protein lysate injected on the reactor. Furthermore, as little as 300 cells can be directly introduced on the proteomic reactor and analyzed by mass spectrometry.

5.
In recent years, mass spectrometry has become one of the core technologies for high-throughput proteomic profiling in biomedical research. However, the reproducibility of results obtained with this technology has been questioned. It has been recognized that sophisticated automatic signal processing algorithms using advanced statistical procedures are needed to analyze high-resolution, high-dimensional proteomic data, e.g., Matrix-Assisted Laser Desorption/Ionization Time-of-Flight (MALDI-TOF) data. In this paper we present pkDACLASS, a software package based on R that provides a complete data analysis solution for users of MALDI-TOF raw data. Complete data analysis comprises data preprocessing, monoisotopic peak detection through statistical model fitting and testing, alignment of the monoisotopic peaks for multiple samples, and classification of the normal and diseased samples through the detected peaks. The software lets users either run the complete, integrated analysis in one step or use it as a flexible platform and inspect the results at every step of the analysis. AVAILABILITY: The package is available for free at http://cran.r-project.org/web/packages/pkDACLASS/index.html.

6.
Archived formalin-fixed paraffin-embedded (FFPE) tissue collections represent a valuable informational resource for proteomic studies. Multiple FFPE core biopsies can be assembled in a single block to form tissue microarrays (TMAs). We describe a protocol for analyzing protein in FFPE-TMAs using matrix-assisted laser desorption/ionization (MALDI) imaging mass spectrometry (IMS). The workflow incorporates an antigen retrieval step following deparaffinization, in situ trypsin digestion, matrix application and then mass spectrometry signal acquisition. Direct analysis of FFPE-TMA tissue using IMS allows multiple tissue samples to be analyzed in a single experiment without extraction and purification of proteins. The advantages of high speed and throughput, easy sample handling and excellent reproducibility make this technology a favorable approach for the proteomic analysis of clinical research cohorts with large sample numbers. For example, TMA analysis of 300 FFPE cores would typically require 6 h of total time through data acquisition, not including data analysis.

7.
The annual Spring Workshop of the HUPO‐PSI took place in Korea, where the Mass Spectrometry and Protein Separations groups joined forces to tackle the issue of the consistent reporting of quantitative proteomic data generated by mass‐spectrometry‐based technologies. A preliminary mzQuantML schema was drafted which, when completed and tested, will complement the existing mzIdentML schema for reporting protein identifications. The Molecular Interactions group concentrated on the implementations of the PSICQUIC (PSI Common Query InterfaCe) service that allows users to simultaneously query interaction data across multiple participating resources. Work was also undertaken to update the MIAPE guidelines, in response to feedback from the editors of a number of proteomic journals.

8.
With the development of high-resolution and high-throughput mass spectrometry (MS) technology, a large quantity of proteomic data is continually being generated. Collecting and sharing these data is a challenge that requires immense and sustained human effort. In this report, we provide a classification of important web resources for MS-based proteomics and rate these web resources based on whether raw data are stored, whether data submission is supported, and whether data analysis pipelines are provided. These web resources are important for biologists involved in proteomics research.

9.
The combination of tandem mass spectrometry and sequence database searching is the method of choice for the identification of peptides and the mapping of proteomes. Over the last several years, the volume of data generated in proteomic studies has increased dramatically, which challenges the computational approaches previously developed for these data. Furthermore, a multitude of search engines have been developed that identify different, overlapping subsets of the sample peptides from a particular set of tandem mass spectrometry spectra. We present iProphet, the new addition to the widely used open-source suite of proteomic data analysis tools, the Trans-Proteomics Pipeline. Applied in tandem with PeptideProphet, it provides a more accurate representation of the multilevel nature of shotgun proteomic data. iProphet combines the evidence from multiple identifications of the same peptide sequences across different spectra, experiments, precursor ion charge states, and modified states. It also allows accurate and effective integration of the results from multiple database search engines applied to the same data. The use of iProphet in the Trans-Proteomics Pipeline increases the number of correctly identified peptides at a constant false discovery rate as compared with both PeptideProphet and another state-of-the-art tool, Percolator. As the main outcome, iProphet permits the calculation of accurate posterior probabilities and false discovery rate estimates at the level of sequence-identical peptide identifications, which in turn leads to more accurate probability estimates at the protein level. Fully integrated with the Trans-Proteomics Pipeline, it supports all commonly used MS instruments, search engines, and computer platforms.
The performance of iProphet is demonstrated on two publicly available data sets: data from a human whole cell lysate proteome profiling experiment representative of typical proteomic data sets, and from a set of Streptococcus pyogenes experiments more representative of organism-specific composite data sets.
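The link between posterior probabilities and false discovery rate estimates mentioned above can be sketched with a generic calculation. This is not iProphet's code or its model; it only illustrates the common approach of estimating the FDR of an accepted set as the average of (1 - probability) over the identifications that pass a threshold. The function name and the example probabilities are hypothetical.

```python
def estimated_fdr(probabilities, threshold):
    """Estimate the FDR among identifications with probability >= threshold.

    Each accepted identification with posterior probability p contributes an
    expected (1 - p) false positives.
    """
    accepted = [p for p in probabilities if p >= threshold]
    if not accepted:
        return 0.0  # nothing accepted, so no false discoveries to estimate
    return sum(1.0 - p for p in accepted) / len(accepted)


# Hypothetical posterior probabilities for four peptide identifications.
posteriors = [0.99, 0.95, 0.90, 0.40]
print(f"estimated FDR at p >= 0.9: {estimated_fdr(posteriors, 0.9):.3f}")
```

Lowering the threshold accepts more identifications but raises the estimated FDR, which is why comparisons between tools are made at a constant false discovery rate.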

10.

Background

As a promising way to transform medicine, mass spectrometry-based proteomics technologies have made great progress in identifying disease biomarkers for clinical diagnosis and prognosis. However, effective feature selection methods that can capture essential data behaviors and achieve clinical-level disease diagnosis are lacking. Moreover, the field faces a data reproducibility challenge: no two independent studies have been found to produce the same proteomic patterns. This reproducibility issue makes the identified biomarker patterns unrepeatable and prevents their real clinical use.

Methods

In this work, we propose a novel machine-learning algorithm, derivative component analysis (DCA), for high-dimensional mass spectral proteomic profiles. As an implicit feature selection algorithm, DCA examines input proteomics data in a multi-resolution approach, seeking its derivatives to capture latent data characteristics and to de-noise the data. We further demonstrate DCA's advantages in disease diagnosis by treating the input proteomics data as a profile biomarker and integrating DCA with support vector machines to tackle the reproducibility issue, and we compare it with state-of-the-art peers.

Results

Our results show that high-dimensional proteomics data are actually linearly separable under the proposed derivative component analysis (DCA). As a novel multi-resolution feature selection algorithm, DCA not only overcomes the weakness of traditional methods in discovering subtle data behavior, but also suggests an effective way to overcome the reproducibility problem of proteomics data and provides new techniques and insights for translational bioinformatics and machine learning. The DCA-based profile biomarker diagnosis makes clinical-level diagnostic performance reproducible across different proteomic data, and it is more robust and systematic than existing diagnosis based on biomarker discovery.

Conclusions

Our findings demonstrate the feasibility and power of the proposed DCA-based profile biomarker diagnosis in achieving high sensitivity and overcoming the data reproducibility issue in serum proteomics. Furthermore, our proposed derivative component analysis suggests that gleaning subtle data characteristics and de-noising are essential in separating true signals from red herrings in high-dimensional proteomic profiles, and that they can be more important than conventional feature selection or dimension reduction. In particular, our profile biomarker diagnosis can be generalized to other omics data because DCA is a generic data analysis method.

11.
Establishment of a near-standard two-dimensional human urine proteomic map
Oh J, Pyo JH, Jo EH, Hwang SI, Kang SC, Jung JH, Park EK, Kim SY, Choi JY, Lim J. Proteomics 2004, 4(11): 3485-3497
A proteomic map for human urine on two-dimensional (2-D) gels has been developed. Initial studies demonstrated that the urine proteins prepared by conventional methods showed interference and poor reproducibility in 2-D electrophoresis (2-DE). To address this issue, urine samples were dialyzed to remove any interfering molecules. Dialysis of the urine proteins and concentration by lyophilization without fractionation significantly improved the reproducibility and resolution, and the resulting preparation likely represents the total urine proteins on a 2-D gel. In addition, removing albumin from urine using Affi-Gel Blue helped to identify the low-abundance proteins. Using the developed method, we prepared proteins from urine collected from healthy females and males. The large inter- and intra-subject variation in protein profiles on 2-D gels made it difficult to establish a normal human urine proteomic 2-D map. To resolve this problem, urinary proteins were prepared from the pooled urine collected from 20 healthy females and males, respectively. The established male and female urine proteomes separated on 2-D gels were almost identical except for some potential sex-dependent protein spots. We have annotated 113 different proteins on the 2-D gel by peptide mass fingerprinting (PMF). We propose that the established total urine proteome can be used for 2-DE analysis, liquid chromatography-tandem mass spectrometry (LC-MS/MS), and identification of novel disease-specific biomarkers.

12.
Time-of-Flight Secondary Ion Mass Spectrometry (TOF-SIMS) was used to determine elemental and biomolecular ions from isolated protein samples. We identified a set of 23 mass-to-charge ratio (m/z) peaks that represent signatures for distinguishing biological samples. The 23 peaks were identified by Singular Value Decomposition (SVD) and Canonical Analysis (CA) to find the underlying structure in the complex mass-spectra data sets. From this modified data, SVD was used to identify sets of m/z peaks, and we used these patterns from the TOF-SIMS data to predict the biological source from which individual mass spectra were generated. The signatures were validated using an additional data set different from the initial training set used to identify the signatures. We present a simple method to identify multiple variables required for sample classification based on mass spectra that avoids overfitting. This is important in a variety of studies using mass spectrometry, including the ability to identify proteins in complex mixtures and for the identification of new biomarkers.

13.
To take advantage of the potential quantitative benefits offered by tandem mass spectrometry, we have modified the method in which tandem mass spectrum data are acquired in 'shotgun' proteomic analyses. The proposed method is not data dependent and is based on the sequential isolation and fragmentation of precursor windows (of 10 m/z) within the ion trap until a desired mass range has been covered. We compared the quantitative figures of merit for this method to those for existing strategies by performing an analysis of the soluble fraction of whole-cell lysates from yeast metabolically labeled in vivo with ¹⁵N. To automate this analysis, we modified software (RelEx) previously written in the Yates lab to generate chromatograms directly from tandem mass spectra. These chromatograms showed improvements in signal-to-noise ratio of approximately three- to fivefold over corresponding chromatograms generated from mass spectrometry scans. In addition, to demonstrate the utility of the data-independent acquisition strategy coupled with chromatogram reconstruction from tandem mass spectra, we measured protein expression levels in two developmental stages of Caenorhabditis elegans.
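The sequential isolation scheme described above can be sketched as a list of consecutive precursor windows. The 10 m/z width comes from the abstract; the 400 to 1400 m/z range, the function name, and the absence of window overlap are assumptions made only for illustration.

```python
def isolation_windows(start_mz, end_mz, width=10.0):
    """Return consecutive (low, high) precursor windows covering [start_mz, end_mz]."""
    if width <= 0:
        raise ValueError("width must be positive")
    windows = []
    low = start_mz
    while low < end_mz:
        high = min(low + width, end_mz)  # the last window is clipped to the range end
        windows.append((low, high))
        low = high
    return windows


# Hypothetical mass range; each window would be isolated and fragmented in turn.
cycle = isolation_windows(400.0, 1400.0)
print(len(cycle), cycle[0], cycle[-1])
```

One full pass through the list corresponds to one acquisition cycle, so a wider mass range or a narrower window lengthens the cycle proportionally.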

14.
MALDI mass spectrometry can generate profiles that contain hundreds of biomolecular ions directly from tissue. Spatially-correlated analysis, MALDI imaging MS, can simultaneously reveal how each of these biomolecular ions varies in clinical tissue samples. The use of statistical data analysis tools to identify regions containing correlated mass spectrometry profiles is referred to as imaging MS-based molecular histology because of its ability to annotate tissues solely on the basis of the imaging MS data. Several reports have indicated that imaging MS-based molecular histology may be able to complement established histological and histochemical techniques by distinguishing between pathologies with overlapping/identical morphologies and revealing biomolecular intratumor heterogeneity. A data analysis pipeline that identifies regions of imaging MS datasets with correlated mass spectrometry profiles could lead to the development of novel methods for improved diagnosis (differentiating subgroups within distinct histological groups) and annotating the spatio-chemical makeup of tumors. Here it is demonstrated that highlighting the regions within imaging MS datasets whose mass spectrometry profiles were found to be correlated by five independent multivariate methods provides a consistently accurate summary of the spatio-chemical heterogeneity. The corroboration provided by using multiple multivariate methods, efficiently applied in an automated routine, provides assurance that the identified regions are indeed characterized by distinct mass spectrometry profiles, a crucial requirement for its development as a complementary histological tool. 
When simultaneously applied to imaging MS datasets from multiple patient samples of intermediate-grade myxofibrosarcoma, a heterogeneous soft tissue sarcoma, nodules with mass spectrometry profiles found to be distinct by five different multivariate methods were detected within morphologically identical regions of all patient tissue samples. To aid the further development of imaging MS-based molecular histology as a complementary histological tool, the Matlab code of the agreement analysis, instructions, and a reduced dataset are included as supporting information.
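The idea of keeping only regions on which several multivariate methods agree can be sketched as a set intersection. This is not the authors' Matlab agreement analysis; it is a minimal illustration that assumes each method has already produced a set of (row, column) pixel coordinates for a region of interest. The function name and pixel sets are hypothetical.

```python
def consensus_pixels(method_masks):
    """Return the pixels that every method assigned to the region of interest."""
    if not method_masks:
        return set()
    agreed = set(method_masks[0])
    for mask in method_masks[1:]:
        agreed &= set(mask)  # keep only pixels also flagged by this method
    return agreed


# Hypothetical regions reported by three methods on a tiny image.
masks = [
    {(0, 0), (0, 1), (1, 1)},
    {(0, 1), (1, 1), (2, 2)},
    {(0, 1), (1, 1)},
]
print(sorted(consensus_pixels(masks)))
# prints: [(0, 1), (1, 1)]
```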

15.
The abundance of proteomics data and results in the literature reveals the importance and influence of proteins and peptides on the human cell cycle. For instance, the proteomic profiling of biological samples, such as serum, plasma or cells, and their organelles, carried out by surface-enhanced laser desorption/ionization mass spectrometry, has led to the discovery of numerous key proteins involved in many biological disease processes. However, questions still remain regarding the reproducibility, bioinformatic artifacts and cross-validations of such experimental set-ups. The authors have developed a material-based approach, termed material-enhanced laser desorption/ionization mass spectrometry (MELDI-MS), to facilitate and improve the robustness of large-scale proteomic experiments. MELDI-MS includes a fully automated protein-profiling platform, from sample preparation and analysis to data processing involving state-of-the-art methods, which can be further improved. Multiplexed protein pattern analysis, based on material morphology, physical characteristics and chemical functionalities, provides a multitude of protein patterns and allows prostate cancer samples to be distinguished from non-prostate cancer samples. Furthermore, MELDI-MS enables not only the analysis of protein signatures, but also the identification of potential discriminating peaks via capillary liquid chromatography mass spectrometry. The optimized MELDI approach offers a complete proteomics platform with improved sensitivity, selectivity and short sample preparation times.

16.
Protein separation by two-dimensional gel electrophoresis is of central importance for proteomics. In combination with systematic protein identification by mass spectrometry, large data sets are routinely generated in several proteome laboratories, and these can be used as "reference maps" for future analyses of analogous biochemical fractions. Here we present GelMap, a novel software tool for building, presenting, and evaluating proteomic reference maps. Variable frames are introduced in order to group proteins into functional categories on three levels or into categories according to differential abundance during comparative proteome analyses. The software is easy to handle as it only requires uploading two digital files to a web site. An additional file including detailed information on all proteins can be combined with the primary map. Two different gel-based projects are presented to illustrate the capacity of GelMap for proteome annotation and evaluation.

18.
Improved biomarkers of acute nephrotoxicity are coveted by the drug development industry, regulatory agencies, and clinicians. In an effort to identify such biomarkers, urinary peptide profiles of rats treated with two different nephrotoxins were investigated. 493 marker candidates were defined that showed a significant response to cisplatin when a cisplatin-treated cohort was compared with controls. Next, urine samples from rats that received three consecutive daily doses of 150 or 300 mg/kg gentamicin were examined. 557 potential biomarkers were initially identified; 108 of these gentamicin-response markers showed a clear temporal response to treatment. 39 of the cisplatin-response markers also displayed a clear response to gentamicin. Of the combined 147 peptides, 101 were similarly regulated by gentamicin or cisplatin and 54 could be identified by tandem mass spectrometry. Most were collagen type I and type III fragments up-regulated in response to gentamicin treatment. Based on these peptides, classification models were generated and validated in a longitudinal study. In agreement with histopathology, the observed changes in classification scores were transient, initiated after the first dose, and generally persistent over a period of 10-20 days before returning to control levels. The data support the hypothesis that gentamicin-induced renal toxicity up-regulates protease activity, resulting in an increase in several specific urinary collagen fragments. Urinary proteomic biomarkers identified here, especially those common to both nephrotoxins, may serve as a valuable tool to investigate potential new drug candidates for the risk of nephrotoxicity.

19.
Direct tissue profiling and imaging mass spectrometry (MS) provide a detailed assessment of the complex protein pattern within a tissue sample. MALDI MS analysis of thin tissue sections results in over 500 individual protein signals in the mass range of 2 to 70 kDa that directly correlate with protein composition within a specific region of the tissue sample. To date, profiling and imaging MS have been applied to multiple diseased tissues, including human gliomas and non-small cell lung cancer. Interrogation of the resulting complex MS data sets has resulted in identification of both disease-state and patient-prognosis specific protein patterns. These results suggest the future usefulness of proteomic information in assessing disease progression, prognosis, and drug efficacy.

20.
Different aspects of matrix-assisted laser desorption/ionization (MALDI) imaging mass spectrometry (IMS) have been used as discovery tools to obtain global and time-correlated information on the local proteomic composition of the sexually mature mouse epididymis from both qualitative and semiquantitative points of view. Tissue sections and laser captured microdissected cells and secretory products were analyzed by MALDI-MS and from the recovered protein profiles, over 400 different proteins were monitored. Over 50 of these, some of which have been identified, displayed regionalized behavior from caput to cauda within the epididymis. Combining the information obtained from high-resolution imaging mass spectrometry and laser captured microdissection experiments, numerous proteins were localized within the epididymis at the cellular level. Furthermore, from the signal intensities observed in the different protein profiles organized in space, semiquantitative information for each protein was obtained.
