Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Recent developments in mass-spectrometry-based shotgun proteomics, especially methods using spectral counting, have enabled large-scale identification and differential profiling of complex proteomes. Most such proteomic studies aim to identify proteins whose abundance differs between conditions. Several quantitative methods have recently been proposed and implemented for this purpose. Building on some techniques that are now widely accepted in the microarray literature, we developed and implemented a new method using a Bayesian model to calculate posterior probabilities of differential abundance for thousands of proteins in a given experiment simultaneously. Our Bayesian model is shown to deliver uniformly superior performance when compared with several existing methods.

2.
3.
Computational analysis of shotgun proteomics data   (Total citations: 2; self-citations: 0; citations by others: 2)
Proteomics technology is progressing at an incredible rate. The latest generation of tandem mass spectrometers can now acquire tens of thousands of fragmentation spectra in a matter of hours. Furthermore, quantitative proteomics methods have been developed that incorporate a stable isotope-labeled internal standard for every peptide within a complex protein mixture for the measurement of relative protein abundances. These developments have opened the doors for 'shotgun' proteomics, yet have also placed a burden on the computational approaches that manage the data. With each new method that is developed, the quantity of data that can be derived from a single experiment increases. To deal with this increase, new computational approaches are being developed to manage the data and assess false positives. This review discusses current approaches for analyzing proteomics data by mass spectrometry and identifies present computational limitations and bottlenecks.

4.
Recent studies have revealed a relationship between protein abundance and sampling statistics, such as sequence coverage, peptide count, and spectral count, in label-free liquid chromatography-tandem mass spectrometry (LC-MS/MS) shotgun proteomics. The use of sampling statistics offers a promising method of measuring relative protein abundance and detecting differentially expressed or coexpressed proteins. We performed a systematic analysis of various approaches to quantifying differential protein expression in eukaryotic Saccharomyces cerevisiae and prokaryotic Rhodopseudomonas palustris label-free LC-MS/MS data. First, we showed that, among three sampling statistics, the spectral count has the highest technical reproducibility, followed by the less-reproducible peptide count and relatively nonreproducible sequence coverage. Second, we used spectral count statistics to measure differential protein expression in pairwise experiments using five statistical tests: Fisher's exact test, G-test, AC test, t-test, and LPE test. Given the S. cerevisiae data set with spiked proteins as a benchmark and the false positive rate as a metric, our evaluation suggested that the Fisher's exact test, G-test, and AC test can be used when the number of replications is limited (one or two), whereas the t-test is useful with three or more replicates available. Third, we generalized the G-test to increase the sensitivity of detecting differential protein expression under multiple experimental conditions. Out of 1622 identified R. palustris proteins in the LC-MS/MS experiment, the generalized G-test detected 1119 differentially expressed proteins under six growth conditions. Finally, we studied correlated expression of these 1119 proteins by analyzing pairwise expression correlations and by delineating protein clusters according to expression patterns. 
Through pairwise expression correlation analysis, we demonstrated that proteins co-located in the same operon were much more strongly coexpressed than those from different operons. Combining cluster analysis with existing protein functional annotations, we identified six protein clusters with known biological significance. In summary, the proposed generalized G-test using spectral count sampling statistics is a viable methodology for robust quantification of relative protein abundance and for sensitive detection of biologically significant differential protein expression under multiple experimental conditions in label-free shotgun proteomics.
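The pairwise G-test mentioned in this abstract can be illustrated with a minimal sketch. It assumes a simple goodness-of-fit form in which the expected counts for a protein are proportional to the total spectral counts of each run. The function name, its arguments, and this normalization are illustrative assumptions and are not taken from the paper.

```python
import math


def g_test_spectral_counts(c1, c2, total1, total2):
    """Pairwise G-test for one protein (illustrative sketch, not the paper's code).

    c1, c2         -- spectral counts of the protein in conditions 1 and 2
    total1, total2 -- total spectral counts of each run (assumed normalization)
    Returns (G, p), with p taken from a chi-square distribution with 1 degree of freedom.
    """
    pooled = c1 + c2
    if pooled == 0:
        return 0.0, 1.0
    # Expected counts under the null hypothesis of equal relative abundance.
    e1 = pooled * total1 / (total1 + total2)
    e2 = pooled * total2 / (total1 + total2)
    g = 0.0
    for observed, expected in ((c1, e1), (c2, e2)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g += observed * math.log(observed / expected)
    g = max(2.0 * g, 0.0)  # guard against tiny negative rounding errors
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(g / 2.0))
    return g, p
```

With equal run totals, counts of 30 versus 5 give a small p-value, while identical counts give G = 0 and p = 1. In a real analysis the p-values would still need correction for multiple testing across all proteins.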

5.
Measurements of mass spectral peak intensities and spectral counts are promising methods for quantifying protein abundance changes in shotgun proteomic analyses. We describe Serac, software developed to evaluate the ability of each method to quantify relative changes in protein abundance. Dynamic range and linearity using a three-dimensional ion trap were tested using standard proteins spiked into a complex sample. Linearity and good agreement between observed versus expected protein ratios were obtained after normalization and background subtraction of peak area intensity measurements and correction of spectral counts to eliminate discontinuity in ratio estimates. Peak intensity values useful for protein quantitation ranged from 10^7 to 10^11 counts with no obvious saturation effect, and proteins in replicate samples showed variations of less than 2-fold within the 95% range (±2σ) when ≥3 peptides/protein were shared between samples. Protein ratios were determined with high confidence from spectral counts when maximum spectral counts were ≥4 spectra/protein, and replicates showed equivalent measurements well within 95% confidence limits. In further tests, complex samples were separated by gel exclusion chromatography, quantifying changes in protein abundance between different fractions. Linear behavior of peak area intensity measurements was obtained for peptides from proteins in different fractions. Protein ratios determined by spectral counting agreed well with those determined from peak area intensity measurements, and both agreed with independent measurements based on gel staining intensities. Overall, spectral counting proved to be a more sensitive method for detecting proteins that undergo changes in abundance, whereas peak area intensity measurements yielded more accurate estimates of protein ratios. 
Finally, these methods were used to analyze differential changes in protein expression in human erythroleukemia K562 cells stimulated under conditions that promote cell differentiation by mitogen-activated protein kinase pathway activation. Protein changes identified with p<0.1 showed good correlations with parallel measurements of changes in mRNA expression.

6.
This review will examine the current situation with label-free, quantitative, shotgun-oriented proteomics technology and discuss the advantages and limitations associated with its capability in capturing and quantifying large portions of proteomes of microorganisms. Such an approach allows (1) comparisons between physiological or genetic states of organisms at the protein level, (2) ‘painting’ of proteomic data onto genome data-based metabolic maps, (3) enhancement of the utility of genomic data and finally (4) surveying of non-genome sequenced microorganisms by taking advantage of available inferred protein data in order to gain new insights into strain-dependent metabolic or physiological capacities. The technology essentially is a powerful addition to systems biology with a capacity to be used to ask hypothesis-driven ‘top-down’ questions or for more empirical ‘bottom-up’ exploration.

7.
A new result report for Mascot search results is described. A greedy set cover algorithm is used to create a minimal set of proteins, which is then grouped into families on the basis of shared peptide matches. Protein families with multiple members are represented by dendrograms, generated by hierarchical clustering using the score of the nonshared peptide matches as a distance metric. The peptide matches to the proteins in a family can be compared side by side to assess the experimental evidence for each protein. If the evidence for a particular family member is considered inadequate, the dendrogram can be cut to reduce the number of distinct family members.
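The greedy set cover step named in this abstract can be sketched as follows. This is a generic illustration of the technique and not Mascot's implementation: the function name, the dictionary input, and the alphabetical tie-breaking rule are assumptions made for the example. The family grouping and dendrogram steps are not covered.

```python
def greedy_protein_set(protein_to_peptides):
    """Pick proteins greedily until every observed peptide is explained.

    protein_to_peptides -- dict mapping a protein name to an iterable of its
                           matched peptide sequences (hypothetical input format)
    Returns the chosen proteins in the order they were selected.
    """
    matches = {name: set(peps) for name, peps in protein_to_peptides.items()}
    uncovered = set()
    for peptides in matches.values():
        uncovered |= peptides

    chosen = []
    while uncovered:
        # Take the protein explaining the most still-uncovered peptides;
        # sorting the names makes ties resolve alphabetically.
        best = max(sorted(matches), key=lambda name: len(matches[name] & uncovered))
        newly_covered = matches[best] & uncovered
        if not newly_covered:
            break
        chosen.append(best)
        uncovered -= newly_covered
    return chosen
```

For example, with `P1` matching peptides a, b, c, `P2` matching b, c, and `P3` matching d, the sketch returns `P1` and `P3`, because `P2` adds no peptide that `P1` does not already explain. A greedy choice approximates the smallest cover and is not guaranteed to find the true minimum.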

8.
We describe Abacus, a computational tool for extracting spectral counts from MS/MS data sets. The program aggregates data from multiple experiments, adjusts spectral counts to accurately account for peptides shared across multiple proteins, and performs common normalization steps. It can also output the spectral count data at the gene level, thus simplifying the integration and comparison between gene and protein expression data. Abacus is compatible with the widely used Trans-Proteomic Pipeline suite of tools and comes with a graphical user interface, making it easy to interact with the program. The main aim of Abacus is to streamline the analysis of spectral count data by providing an automated, easy-to-use solution for extracting this information from proteomic data sets for subsequent, more sophisticated statistical analysis.

9.
Tandem mass spectrometry allows for fast protein identification in a complex sample. As mass spectrometers get faster, more sensitive and more accurate, many academic research groups and commercial suppliers have devised methods that also allow quantitative protein research. Since label-free methods are an attractive alternative to labeling approaches for proteomics researchers seeking accurate quantitative results, we evaluated several open-source analysis tools in terms of performance on two reference data sets, explicitly generated for this purpose. In this paper we present an implementation, T3PQ (Top 3 Protein Quantification), of the method suggested by Silva and colleagues for LC-MSE applications and we demonstrate its applicability to data generated on FT-ICR instruments acquiring in data dependent acquisition (DDA) mode. In order to validate this method and to show its usefulness also for absolute protein quantification, we generated a reference data set of a sample containing four different proteins with known concentrations. Furthermore, we compare three other label-free quantification methods using a complex biological sample spiked with a standard protein in defined concentrations. We evaluate the applicability of these methods and the quality of the results in terms of robustness and dynamic range of the spiked-in protein as well as other proteins also detected in the mixture. We discuss drawbacks of each method individually and consider crucial points for experimental designs. The source code of our implementation is available under the terms of the GNU GPLv3 and can be downloaded from sourceforge (http://fqms.svn.sourceforge.net/svnroot/fqms). A tarball containing the data used for the evaluation is available on the FGCZ web server (http://fgcz-data.uzh.ch/public/T3PQ.tgz).
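The Top 3 idea can be sketched briefly, assuming the usual reading of the method: a protein's signal is the mean intensity of its three most intense peptides, and an absolute amount is obtained by scaling against a spiked standard of known amount. The function names and inputs below are hypothetical and are not taken from the T3PQ source code.

```python
def top3_signal(peptide_intensities, n=3):
    """Mean intensity of the n most intense peptides of one protein.

    Returns None when fewer than n peptides were quantified, because the
    estimate is then not comparable between proteins (an assumption of this sketch).
    """
    top = sorted(peptide_intensities, reverse=True)[:n]
    if len(top) < n:
        return None
    return sum(top) / n


def absolute_amount(protein_intensities, standard_intensities, standard_amount):
    """Scale a protein's Top 3 signal by that of a standard of known amount."""
    protein_signal = top3_signal(protein_intensities)
    standard_signal = top3_signal(standard_intensities)
    if protein_signal is None or not standard_signal:
        return None
    return standard_amount * protein_signal / standard_signal
```

For instance, a protein whose Top 3 signal is twice that of a standard spiked at 50 fmol would be estimated at 100 fmol under these assumptions.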

10.

Background  

The low concentration and highly hydrophobic nature of proteins in lipid raft samples present significant challenges for the sensitive and accurate proteomic analyses of lipid raft proteins. Elimination of highly enriched lipids and interfering substances from raft samples is generally required before mass spectrometric analyses can be performed, but these procedures often lead to excessive protein loss and increased sample variability. For accurate analyses of the raft proteome, simplified protocols are needed to avoid excessive sample handling and purification steps.

11.
12.
13.
Orthogonal analysis of amino acid substitutions as a result of SNPs in existing proteomic datasets provides a critical foundation for the emerging field of population-based proteomics. Large-scale proteomics datasets, derived from shotgun tandem MS analysis of complex cellular protein mixtures, contain many unassigned spectra that may correspond to alternate alleles coded by SNPs. The purpose of this work was to identify tandem MS spectra in LC-MS/MS shotgun proteomics datasets that may represent coding nonsynonymous SNPs (nsSNP). To this end, we generated a tryptic peptide database created from allelic information found in NCBI's dbSNP. We searched this database with tandem MS spectra of tryptic peptides from DU4475 breast tumor cells that had been fractionated by pI in the first dimension and reverse-phase LC in the second dimension. In all we identified 629 nsSNPs, of which 36 were of alternate SNP alleles not found in the reference NCBI or IPI protein databases. Searches for SNP-peptides carry a high risk of false positives due both to mass shifts caused by modifications and to multiple representations of the same peptide within the genome. In this work, false positives were filtered using a novel peptide pI prediction algorithm and characterized using a decoy database developed by random substitution of similarly sized reference peptides. Secondary validation by sequencing of corresponding genomic DNA confirmed the presence of the predicted SNP in 8 of 10 SNP-peptides. This work highlights that the usefulness of interpreting unassigned spectra as polymorphisms is highly reliant on the ability to detect and filter false positives.

14.
15.
Mass spectrometry-based approaches are commonly used to identify proteins from multiprotein complexes, typically with the goal of identifying new complex members or identifying post-translational modifications. However, with the recent demonstration that spectral counting is a powerful quantitative proteomic approach, the analysis of multiprotein complexes by mass spectrometry can be reconsidered in certain cases. Using the chromatography-based approach named multidimensional protein identification technology, multiprotein complexes may be analyzed quantitatively using the normalized spectral abundance factor that allows comparison of multiple independent analyses of samples. This study describes an approach to visualize multiprotein complex datasets that provides structure-function information that is superior to tabular lists of data. In this method review, we describe a reanalysis of the Rpd3/Sin3 small and large histone deacetylase complexes previously described in a tabular form to demonstrate the normalized spectral abundance factor approach.
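The normalized spectral abundance factor (NSAF) is commonly defined as a protein's spectral count divided by its length, normalized by the sum of that ratio over all proteins in the run. The sketch below follows that definition. The function name and the dictionary inputs are assumptions made for illustration and do not come from the reviewed study.

```python
def nsaf(spectral_counts, lengths):
    """Compute NSAF values for the proteins of one run.

    spectral_counts -- dict: protein name -> spectral count (SpC)
    lengths         -- dict: protein name -> protein length in residues (L)
    Returns a dict of NSAF values, which sum to 1 when any count is non-zero.
    """
    # Spectral abundance factor: counts corrected for protein length.
    saf = {name: spectral_counts[name] / lengths[name] for name in spectral_counts}
    total = sum(saf.values())
    if total == 0:
        return {name: 0.0 for name in saf}
    return {name: value / total for name, value in saf.items()}
```

Two proteins with 10 spectra each but lengths of 100 and 300 residues receive NSAF values of 0.75 and 0.25, which shows how the length correction keeps long proteins from appearing more abundant simply because they yield more peptides. Because the values are normalized within each run, they can be compared across independent analyses.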

16.
Dilated cardiomyopathy (DCM) is characterized by contractile dysfunction leading to heart failure. The molecular changes in the human heart associated with this disease have so far mostly been addressed at the gene expression level and only a few studies have analyzed global changes in the myocardial proteome. Therefore, our objective was to investigate the changes in the proteome in patients suffering from inflammatory DCM (iDCM) and chronic viral infection by a comprehensive quantitative approach. Comparative proteomic profiling of endomyocardial biopsies (EMB) from 10 patients with iDCM (left ventricular ejection fraction <40%, symptoms of heart failure) as well as 7 controls with normal left ventricular function and histology was performed by label-free proteome analysis (LC-MS/MS). Mass spectrometric data were analyzed with the Rosetta Elucidator software package. The analysis covered a total of 485 proteins. Among the 174 proteins displaying at least a 1.3-fold change in intensity (p < 0.05), major changes were observed for mitochondrial and cytoskeletal proteins, and metabolic pathways were also affected in iDCM compared to controls. In iDCM patients, we observed decreased levels of mitochondrial proteins involved in oxidative phosphorylation and tricarboxylic acid cycle. Furthermore, deregulation of proteins of carbohydrate metabolism, the actin cytoskeleton, and extracellular matrix remodeling was observed. Proteomic observations were confirmed by gene expression data and immunohistochemistry (e.g. collagen I and VI). This study demonstrates that label-free, mass spectrometry-centered approaches can identify disease-dependent alterations in the proteome from small tissue samples such as endomyocardial biopsies. Thus, this technique might allow better disease characterization and may be a valuable tool in potential clinical proteomic studies.

17.
The emergence of shotgun proteomics has facilitated the numerous biological discoveries made by proteomic studies. However, comprehensive proteomic analysis remains challenging and shotgun proteomics is a continually changing field. This review details the recent developments in shotgun proteomics and describes emerging technologies that will influence shotgun proteomics going forward. In addition, proteomic studies of integral membrane proteins remain challenging due to the hydrophobic nature of integral membrane proteins and their generally low abundance levels. However, there have been many strategies developed for enriching, isolating and separating membrane proteins for proteomic analysis that have moved this field forward. In summary, while shotgun proteomics is a widely used and mature technology, the continued pace of improvements in mass spectrometry and proteomic technology and methods indicates that future studies will have an even greater impact on biological discovery.

18.
Biological systems are in a continual state of flux, which necessitates an understanding of the dynamic nature of protein abundances. The study of protein abundance dynamics has become feasible with recent improvements in mass spectrometry-based quantitative proteomics. However, a number of challenges still remain related to how best to extract biological information from dynamic proteomics data, for example, challenges related to extraneous variability, missing abundance values, and the identification of significant temporal patterns. This paper describes a strategy that addresses these issues and demonstrates its value for analyzing temporal bottom-up proteomics data using data from a Rhodobacter sphaeroides 2.4.1 time-course study.

19.
20.
Normalized spectral index quantification was recently presented as an accurate method of label‐free quantitation, which improved spectral counting by incorporating the intensities of peptide MS/MS fragment ions into the calculation of protein abundance. We present SINQ, a tool implementing this method within the framework of existing analysis software, our freely available central proteomics facilities pipeline (CPFP). We demonstrate, using data sets of protein standards acquired on a variety of mass spectrometers, that SINQ can rapidly provide useful estimates of the absolute quantity of proteins present in a medium‐complexity sample. In addition, relative quantitation of standard proteins spiked into a complex lysate background and run without pre‐fractionation produces accurate results at amounts above 1 fmol on column. We compare quantitation performance to various precursor intensity‐ and identification‐based methods, including the normalized spectral abundance factor (NSAF), exponentially modified protein abundance index (emPAI), MaxQuant, and Progenesis LC‐MS. We anticipate that the SINQ tool will be a useful asset for core facilities and individual laboratories that wish to produce quantitative MS data, but lack the necessary manpower to routinely support more complicated software workflows. SINQ is freely available to obtain and use as part of the central proteomics facilities pipeline, which is released under an open‐source license.
