Similar articles
 20 similar articles found (search time: 15 ms)
1.
The main goal of many proteomics experiments is an accurate and rapid quantification and identification of regulated proteins in complex biological samples. The bottleneck in quantitative proteomics remains the availability of efficient software to evaluate and quantify the tremendous amount of mass spectral data acquired during a proteomics project. A new software suite, ICPLQuant, has been developed to accurately quantify isotope‐coded protein label (ICPL)‐labeled peptides on the MS level during LC‐MALDI and peptide mass fingerprint experiments. The tool is able to generate a list of differentially regulated peptide precursors for subsequent MS/MS experiments, minimizing time‐consuming acquisition and interpretation of MS/MS data. ICPLQuant is based on two independent units. Unit 1 performs ICPL multiplex detection and quantification and proposes peptides to be identified by MS/MS. Unit 2 combines MASCOT MS/MS protein identification with the quantitative data and produces a protein/peptide list with all the relevant information accessible for further data mining. The accuracy of quantification, selection of peptides for MS/MS‐identification and the automated output of a protein list of regulated proteins are demonstrated by the comparative analysis of four different mixtures of three proteins (Ovalbumin, Horseradish Peroxidase and Rabbit Albumin) spiked into the complex protein background of the DGPF Proteome Marker.

2.
3.
Glycosylation modifies the physicochemical properties and protein binding functions of glycoconjugates. These modifications are biosynthesized in the endoplasmic reticulum and Golgi apparatus by a series of enzymatic transformations that are under complex control. As a result, mature glycans on a given site are heterogeneous mixtures of glycoforms. This gives rise to a spectrum of adhesive properties that strongly influences interactions with binding partners and resultant biological effects. In order to understand the roles glycosylation plays in normal and disease processes, efficient structural analysis tools are necessary. In the field of glycomics, liquid chromatography/mass spectrometry (LC/MS) is used to profile the glycans present in a given sample. This technology enables comparison of glycan compositions and abundances among different biological samples, e.g. normal versus disease, normal versus mutant, etc. Manual analysis of the glycan profiling LC/MS data is extremely time-consuming and efficient software tools are needed to eliminate this bottleneck. In this work, we have developed a tool to computationally model LC/MS data to enable efficient profiling of glycans. Using LC/MS data deconvoluted by Decon2LS/DeconTools, we built a list of unique neutral masses corresponding to candidate glycan compositions summarized over their various charge states, adducts and range of elution times. Our work aims to provide confident identification of true compounds in complex data sets that are not amenable to manual interpretation. This capability is an essential part of glycomics workflows. We demonstrate this tool, GlycReSoft, using an LC/MS dataset on tissue derived heparan sulfate oligosaccharides. The software, code and a test data set are publicly archived under an open source license.
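
The step of collapsing deconvoluted peaks into unique neutral masses across charge states can be illustrated with a minimal sketch. This is not GlycReSoft's code: the function names, the (m/z, charge, intensity) input format, the 10 ppm tolerance and the simple proton-only adduct model are all assumptions made for illustration.

```python
PROTON_MASS = 1.007276466  # mass of a proton in Da

def neutral_mass(mz, charge, polarity=1):
    """Neutral mass of an ion that gained (polarity=1) or lost
    (polarity=-1) `charge` protons. Other adducts are ignored here."""
    return charge * (mz - polarity * PROTON_MASS)

def group_by_neutral_mass(peaks, ppm_tol=10.0, polarity=1):
    """Merge peaks whose neutral masses agree within ppm_tol.

    peaks: iterable of (mz, charge, intensity) tuples (hypothetical format).
    Returns a list of dicts with an intensity-weighted mean mass,
    the summed intensity and the number of merged peaks.
    """
    masses = sorted(
        (neutral_mass(mz, z, polarity), inten)
        for mz, z, inten in peaks
        if inten > 0  # skip empty peaks so the weighted mean is defined
    )
    groups = []
    for mass, inten in masses:
        last = groups[-1] if groups else None
        if last and abs(mass - last["mass"]) / last["mass"] * 1e6 <= ppm_tol:
            total = last["intensity"] + inten
            last["mass"] = (last["mass"] * last["intensity"] + mass * inten) / total
            last["intensity"] = total
            last["count"] += 1
        else:
            groups.append({"mass": mass, "intensity": inten, "count": 1})
    return groups
```

A real workflow would also merge adduct forms and track elution-time ranges, which this sketch leaves out.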

4.
One of the major bottlenecks in the proteomics field today resides in the computational interpretation of the massive data generated by the latest generation of high‐throughput MS instruments. MS/MS datasets are constantly increasing in size and complexity, and it is becoming challenging to comprehensively process such huge datasets and afterwards deduce the most relevant biological information. The Mass Spectrometry Data Analysis (MSDA, https://msda.unistra.fr ) online software suite provides a series of modules for in‐depth MS/MS data analysis. It includes a custom database generation toolbox, modules for filtering and extracting high‐quality spectra, for running high‐performance database and de novo searches, and for extracting modified peptide spectra and functional annotations. Additionally, MSDA enables running the most computationally intensive steps, namely database and de novo searches, on a computer grid, thus providing a net time gain of up to 99% for data processing.

5.
Reversed-phase liquid chromatography (LC) directly coupled with electrospray-tandem mass spectrometry (MS/MS) is a successful choice to obtain a large number of product ion spectra from a complex peptide mixture. We describe a search validation program, ScoreRidge, developed for analysis of LC-MS/MS data. The program validates peptide assignments to product ion spectra resulting from usual probability-based searches against primary structure databases. The validation is based only on correlation between the measured LC elution time of each peptide and the deduced elution time from the amino acid sequence assigned to product ion spectra obtained from the MS/MS analysis of the peptide. Sufficient numbers of probable assignments gave a highly correlative curve. Any peptide assignments within a certain tolerance from the correlation curve were accepted for the following arrangement step to list identified proteins. Using this data validation program, host protein candidates responsible for interaction with human hepatitis B virus core protein were identified from a partially purified protein mixture. The present simple and practical program complements protein identification from usual product ion search algorithms and reduces manual interpretation of the search result data. It will lead to more explicit protein identification from complex peptide mixtures such as whole proteome digests from tissue samples.
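
The elution-time validation idea in this abstract can be sketched as follows. This is not ScoreRidge itself: the abstract does not give the form of the correlation curve, so a straight least-squares line is assumed here, and the function names, input tuples and tolerance value are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def validate_by_elution_time(trusted, candidates, tolerance):
    """Accept candidates whose observed elution time lies near the curve.

    trusted:    list of (predicted_time, observed_time) pairs from
                probable assignments, used only to fit the curve.
    candidates: list of (peptide, predicted_time, observed_time).
    tolerance:  maximum allowed |observed - fitted| (assumed units: min).
    """
    slope, intercept = fit_line([p for p, _ in trusted], [o for _, o in trusted])
    return [
        peptide
        for peptide, predicted, observed in candidates
        if abs(observed - (slope * predicted + intercept)) <= tolerance
    ]
```

Fitting on the probable assignments only, then testing every candidate against the fitted line, keeps doubtful assignments from distorting the curve they are judged against.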

6.
LC combined with MS/MS analysis of complex mixtures of protein digests is a reliable and sensitive method for characterization of protein phosphorylation. Peptide retention times (RTs) measured during an LC‐MS/MS run depend on both the peptide sequence and the location of modified amino acids. These RTs can be predicted using the LC of biomacromolecules at critical conditions model (BioLCCC). Comparing the observed RTs to those obtained from the BioLCCC model can provide additional validation of MS/MS‐based peptide identifications to reduce the false discovery rate and to improve the reliability of phosphoproteome profiling. In this study, energies of interaction between phosphorylated residues and the surface of RP separation media for both “classic” alkyl C18 and polar‐embedded C18 stationary phases were experimentally determined and included in the BioLCCC model extended for phosphopeptide analysis. The RTs for phosphorylated peptides and their nonphosphorylated analogs were predicted using the extended BioLCCC model and compared with their experimental RTs. The extended model was evaluated using literature data and a complex phosphoproteome data set distributed through the Association of Biomolecular Resource Facilities Proteome Informatics Research Group 2010 study. The reported results demonstrate the capability of the extended BioLCCC model to predict RTs, which may lead to improved sensitivity and reliability of LC‐MS/MS‐based phosphoproteome profiling.

7.
Orthogonal analysis of amino acid substitutions as a result of SNPs in existing proteomic datasets provides a critical foundation for the emerging field of population-based proteomics. Large-scale proteomics datasets, derived from shotgun tandem MS analysis of complex cellular protein mixtures, contain many unassigned spectra that may correspond to alternate alleles coded by SNPs. The purpose of this work was to identify tandem MS spectra in LC-MS/MS shotgun proteomics datasets that may represent coding nonsynonymous SNPs (nsSNP). To this end, we generated a tryptic peptide database created from allelic information found in NCBI's dbSNP. We searched this database with tandem MS spectra of tryptic peptides from DU4475 breast tumor cells that had been fractionated by pI in the first dimension and reverse-phase LC in the second dimension. In all we identified 629 nsSNPs, of which 36 were of alternate SNP alleles not found in the reference NCBI or IPI protein databases. Searches for SNP-peptides carry a high risk of false positives due both to mass shifts caused by modifications and to multiple representations of the same peptide within the genome. In this work, false positives were filtered using a novel peptide pI prediction algorithm and characterized using a decoy database developed by random substitution of similarly sized reference peptides. Secondary validation by sequencing of corresponding genomic DNA confirmed the presence of the predicted SNP in 8 of 10 SNP-peptides. This work highlights that the usefulness of interpreting unassigned spectra as polymorphisms is highly reliant on the ability to detect and filter false positives.

8.
Time-consuming and experience-dependent manual validations of tandem mass spectra are usually applied to SEQUEST results. This inefficient method has become a significant bottleneck for MS/MS data processing. Here we introduce a program, AMASS (advanced mass spectrum screener), which can filter the tandem mass spectra of SEQUEST results by measuring the match percentage of high-abundance ions and the continuity of matched fragment ions in the b and y series. Compared with the Xcorr and DeltaCn filters, AMASS can increase the number of positives and reduce the number of negatives in 22 datasets generated from 18 known protein mixtures. It effectively removed most noisy spectra, false interpretations, and about half of the poor fragmentation spectra, and AMASS can work synergistically with the Rscore filter. We believe the use of AMASS and Rscore can result in a more accurate identification of peptide MS/MS spectra and reduce the time and energy for manual validation.
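
The two measures named in this abstract, the match percentage of high-abundance ions and the continuity of matched fragment ions, can be sketched in a few lines. This is not the AMASS implementation: the abstract gives no formulas, so the definitions below (top-N peaks by intensity, a fixed m/z tolerance, longest unbroken run of matched ions) and all function names and default values are assumptions.

```python
def match_percentage(peaks, theoretical_mz, top_n=10, tol=0.5):
    """Fraction of the top_n most intense peaks that lie within tol (m/z)
    of any theoretical fragment ion.

    peaks: list of (mz, intensity) tuples (hypothetical format).
    """
    top = sorted(peaks, key=lambda p: p[1], reverse=True)[:top_n]
    if not top:
        return 0.0
    matched = sum(
        1 for mz, _ in top if any(abs(mz - t) <= tol for t in theoretical_mz)
    )
    return matched / len(top)

def longest_matched_run(matched_flags):
    """Length of the longest unbroken run of matched ions in one series.

    matched_flags: booleans in series order (e.g. b1, b2, ...), True
    where the corresponding fragment ion was observed.
    """
    best = run = 0
    for flag in matched_flags:
        run = run + 1 if flag else 0
        best = max(best, run)
    return best
```

A filter in this spirit would reject a spectrum when either value falls below a chosen cutoff; the abstract does not state the cutoffs AMASS uses.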

9.
Label-free quantification of high mass resolution LC-MS data has emerged as a promising technology for proteome analysis. Computational methods are required for the accurate extraction of peptide signals from LC-MS data and the tracking of these features across the measurements of different samples. We present here an open source software tool, SuperHirn, that comprises a set of modules to process LC-MS data acquired on a high resolution mass spectrometer. The program includes newly developed functionalities to analyze LC-MS data such as feature extraction and quantification, LC-MS similarity analysis, LC-MS alignment of multiple datasets, and intensity normalization. These program routines extract profiles of measured features and comprise tools for clustering and classification analysis of the profiles. SuperHirn was applied in an MS1-based profiling approach to a benchmark LC-MS dataset of complex protein mixtures with defined concentration changes. We show that the program automatically detects profiling trends in an unsupervised manner and is able to associate proteins to their correct theoretical dilution profile.

10.
Whole-cell protein quantification using MS has proven to be a challenging task. Detection efficiency varies significantly from peptide to peptide, molecular identities are not evident a priori, and peptides are dispersed unevenly throughout the multidimensional data space. To overcome these challenges we developed an open-source software package, MapQuant, to quantify comprehensively organic species detected in large MS datasets. MapQuant treats an LC/MS experiment as an image and utilizes standard image processing techniques to perform noise filtering, watershed segmentation, peak finding, peak fitting, peak clustering, charge-state determination and carbon-content estimation. MapQuant reports abundance values that respond linearly with the amount of sample analyzed on both low- and high-resolution instruments (over a 1000-fold dynamic range). Background noise added to a sample, either as a medium-complexity peptide mixture or as a high-complexity trypsinized proteome, exerts negligible effects on the abundance values reported by MapQuant and with coefficients of variance comparable to other methods. Finally, MapQuant's ability to define accurate mass and retention time features of isotopic clusters on a high-resolution mass spectrometer can increase protein sequence coverage by assigning sequence identities to observed isotopic clusters without corresponding MS/MS data.

11.
The quantification of changes in protein abundance in complex biological specimens is essential for proteomic studies in basic and applied research. Here we report on the development and validation of the DeepQuanTR software for identification and quantification of differentially expressed proteins using LC‐MALDI‐MS. Following enzymatic digestion, HPLC peptide separation and normalization of MALDI‐MS signal intensities to those of internal standards, the software extracts peptide features, adjusts differences in HPLC retention times and performs a relative quantification of features. The annotation of multiple peptides to the corresponding parent protein allows the definition of a Protein Quant Value, which is related to protein abundance and which allows inter‐sample comparisons. The performance of DeepQuanTR was evaluated by analyzing 24 samples derived from human serum spiked with different amounts of four proteins and eight complex samples of vascular proteins, derived from surgically resected human kidneys with cancer following ex vivo perfusion with a reactive ester biotin derivative. The identification and experimental validation of proteins that were differentially regulated in cancerous lesions as compared with normal kidney were used to demonstrate the power of DeepQuanTR. This software, which can easily be used with established proteomic methodologies, facilitates the relative quantification of proteins derived from a wide variety of different samples.

12.
Current efforts aimed at developing high-throughput proteomics focus on increasing the speed of protein identification. Although improvements in sample separation, enrichment, automated handling, mass spectrometric analysis, as well as data reduction and database interrogation strategies have done much to increase the quality, quantity and efficiency of data collection, significant bottlenecks still exist. Various separation techniques have been coupled with tandem mass spectrometric (MS/MS) approaches to allow a quicker analysis of complex mixtures of proteins, especially where a high number of unambiguous protein identifications are the exception, rather than the rule. MS/MS is required to provide structural / amino acid sequence information on a peptide and thus allow protein identity to be inferred from individual peptides. Currently these spectra need to be manually validated because of (a) the potential for false positive matches, i.e., protein not in database, and (b) observed fragmentation trends that may not be incorporated into current MS/MS search algorithms. This validation represents a significant bottleneck associated with high-throughput proteomic strategies. We have developed CHOMPER, a software program which reduces the time required to both visualize and confirm MS/MS search results and generate post-analysis reports and protein summary tables. CHOMPER extracts the identification information from SEQUEST MS/MS search result files, reproduces both the peptide and protein identification summaries, provides a more interactive visualization of the MS/MS spectra and facilitates the direct submission of manually validated identifications to a database.

13.
Liu BA, Engelmann BW, Nash PD. Proteomics, 2012, 12(10): 1527-1546
Modular protein interaction domains (PIDs) that recognize linear peptide motifs are found in hundreds of proteins within the human genome. Some PIDs such as SH2, 14-3-3, Chromo, and Bromo domains serve to recognize posttranslational modification (PTM) of amino acids (such as phosphorylation, acetylation, methylation, etc.) and translate these into discrete cellular responses. Other modules such as SH3 and PSD-95/Discs-large/ZO-1 (PDZ) domains recognize linear peptide epitopes and serve to organize protein complexes based on localization and regions of elevated concentration. In both cases, the ability to nucleate specific signaling complexes is in large part dependent on the selectivity of a given protein module for its cognate peptide ligand. High-throughput (HTP) analysis of peptide-binding domains by peptide or protein arrays, phage display, mass spectrometry, or other HTP techniques provides new insight into the potential protein-protein interactions prescribed by individual or even whole families of modules. Systems level analyses have also promoted a deeper understanding of the underlying principles that govern selective protein-protein interactions and how selectivity evolves. Lastly, there is a growing appreciation for the limitations and potential pitfalls associated with HTP analysis of protein-peptide interactomes. This review will examine some of the common approaches utilized for large-scale studies of PIDs and suggest a set of standards for the analysis and validation of datasets from large-scale studies of peptide-binding modules. We will also highlight how data from large-scale studies of modular interaction domain families can provide insight into systems level properties such as the linguistics of selective interactions.

14.
Motivation: Next-generation sequencing technologies have led to the widespread use of -omic applications. As a result, there is now a pronounced bioinformatic bottleneck. The general model organism database (GMOD) tool kit (http://gmod.org) has produced a number of resources aimed at addressing this issue. It lacks, however, a robust online solution that can deploy heterogeneous data and software within a Web content management system (CMS). Results: We present a bioinformatic framework for the Drupal CMS. It consists of three modules. First, GMOD-DBSF is an application programming interface module for the Drupal CMS that simplifies the programming of bioinformatic Drupal modules. Second, the Drupal Bioinformatic Software Bench (biosoftware_bench) allows for a rapid and secure deployment of bioinformatic software. An innovative graphical user interface (GUI) guides both use and administration of the software, including the secure provision of pre-publication datasets. Third, we present genes4all_experiment, which exemplifies how our work supports the wider research community. Conclusion: Given the infrastructure presented here, the Drupal CMS may become a powerful new tool set for bioinformaticians. The GMOD-DBSF base module is an expandable community resource that decreases development time of Drupal modules for bioinformatics. The biosoftware_bench module can already enhance biologists' ability to mine their own data. The genes4all_experiment module has already been responsible for archiving of more than 150 studies of RNAi from Lepidoptera, which were previously unpublished. Availability and implementation: Implemented in PHP and Perl. Freely available under the GNU Public License 2 or later from http://gmod-dbsf.googlecode.com.

15.
We report a significantly enhanced bioinformatics suite and database for proteomics research called Yale Protein Expression Database (YPED) that is used by investigators at more than 300 institutions worldwide. YPED meets the data management, archival, and analysis needs of high-throughput mass spectrometry-based proteomics research ranging from a single laboratory, a group of laboratories within and beyond an institution, to the entire proteomics community. The current version is a significant improvement over the first version in that it contains new modules for liquid chromatography–tandem mass spectrometry (LC–MS/MS) database search results, label and label-free quantitative proteomic analysis, and several scoring outputs for phosphopeptide site localization. In addition, we have added both peptide and protein comparative analysis tools to enable pairwise analysis of distinct peptides/proteins in each sample and of overlapping peptides/proteins between all samples in multiple datasets. We have also implemented a targeted proteomics module for automated multiple reaction monitoring (MRM)/selective reaction monitoring (SRM) assay development. We have linked YPED's database search results and both label-based and label-free fold-change analysis to the Skyline Panorama repository for online spectra visualization. In addition, we have built enhanced functionality to curate peptide identifications into an MS/MS peptide spectral library for all of our protein database search identification results.

16.
A novel software tool named PTM-Explorer has been applied to LC-MS/MS datasets acquired within the Human Proteome Organisation (HUPO) Brain Proteome Project (BPP). PTM-Explorer enables automatic identification of peptide MS/MS spectra that were not explained in typical sequence database searches. The main focus was detection of PTMs, but PTM-Explorer also detects unspecific peptide cleavage, mass measurement errors, experimental modifications, amino acid substitutions, transpeptidation products and unknown mass shifts. To avoid a combinatorial problem the search is restricted to a set of selected protein sequences, which stem from previous protein identifications using a common sequence database search. Prior to application to the HUPO BPP data, PTM-Explorer was evaluated on thoroughly manually characterized and evaluated LC-MS/MS data sets from Alpha-A-Crystallin gel spots obtained from mouse eye lens. Besides various PTMs including phosphorylation, a wealth of experimental modifications and unspecific cleavage products were successfully detected, completing the primary structure information of the measured proteins. Our results indicate that a large number of MS/MS spectra that currently remain unidentified in standard database searches contain valuable information that can only be elucidated using suitable software tools.

17.
We introduce the computer tool “Know Your Samples” (KYSS) for assessment and visualisation of large scale proteomics datasets, obtained by mass spectrometry (MS) experiments. KYSS facilitates the evaluation of sample preparation protocols, LC peptide separation, and MS and MS/MS performance by monitoring the number of missed cleavages, precursor ion charge states, number of protein identifications and peptide mass error in experiments. KYSS generates several different protein profiles based on protein abundances, and allows for comparative analysis of multiple experiments. KYSS was adapted for blood plasma proteomics and provides concentrations of identified plasma proteins. We demonstrate the utility of the KYSS tool for MS based proteome analysis of blood plasma and for assessment of hydrogel particles for depletion of abundant proteins in plasma. The KYSS software is open source and is freely available at http://kyssproject.github.io/.

18.
A very popular approach in proteomics is the so-called "shotgun LC-MS/MS" strategy. In its most commonly used form, a total protein digest is separated by ion exchange fractionation in the first dimension followed by off- or on-line RP LC-MS/MS. We replaced the first dimension by isoelectric focusing in the liquid phase using the Off-Gel device producing 15 fractions. As peptides are separated by their isoelectric point in the first dimension and hydrophobicity in the second, those experimentally derived parameters (pI and R_T) can be used for the validation of potentially identified peptides. We applied this strategy to a cellular extract of Drosophila Kc167 cells and identified peptides with two different database search engines, namely PHENYX and SEQUEST, with PeptideProphet validation of the SEQUEST results. PHENYX returned 7582 potential peptide identifications and SEQUEST 7629. The SEQUEST results were reduced to 2006 identifications by validation with PeptideProphet. Validation of the PeptideProphet, SEQUEST and PHENYX results by pI and R_T parameters confirmed 1837 PeptideProphet identifications while in the remainder of the SEQUEST results another 1130 peptides were found to be likely hits. The validation on PHENYX resulted in the fixation of a solid p-value threshold of <1 × 10^-4 that sets by itself the correct identification confidence to >95%, and a final count of 2034 highly confident peptide identifications was achieved after pI and R_T validation. Although the PeptideProphet and PHENYX datasets have a very high confidence, the overlap of common identifications was only 79.4%, to be explained by the fact that data interpretation was done searching different protein databases with two search engines of different algorithms. The approach used in this study allowed for an automated and improved data validation process for shotgun proteomics projects producing MS/MS peptide identification results of very high confidence.

19.
20.