Similar literature
 20 similar articles found
1.
Plasma Proteome Database as a resource for proteomics research   Total citations: 1, self-citations: 0, citations by others: 1
Plasma is one of the best studied compartments in the human body and serves as an ideal body fluid for the diagnosis of diseases. This report provides a detailed functional annotation of all the plasma proteins identified to date. In all, gene products encoded by 3778 distinct genes were annotated based on proteins previously published in the literature as plasma proteins and the identification of multiple peptides from proteins under HUPO's Plasma Proteome Project. Our analysis revealed that 51% of these genes encoded more than one protein isoform. All single nucleotide polymorphisms involving protein-coding regions were mapped onto the protein sequences. We found a number of examples of isoform-specific subcellular localization as well as tissue expression. This database is an attempt at comprehensive annotation of a complex subproteome and is available on the web at http://www.plasmaproteomedatabase.org.

2.
3.
Beer I, Barnea E, Admon A. Proteomics 2005, 5(13): 3491-3496
The human Plasma Proteome Project (PPP) is a large-scale collaboration between many laboratories. One of the most demanding tasks in the PPP involved the analysis of very large amounts of raw MS/MS data produced by the participants. The main approach for managing this task was letting the participants analyze their own data and submit the results to the central PPP repository as lists of identified proteins and peptides. To complement this distributed approach, we also performed centralized analysis of the raw MS/MS data provided by the participants. Due to the data redundancy inherent in such a project, centralized analysis has the potential to reduce the computational effort by reducing redundancy before the analysis. Centralized analysis can also unify the process and take advantage of data sharing among laboratories to improve protein identification and validation. The process we employed included removing low-quality spectra, clustering spectra by mutual similarity, and applying uniform peptide and protein identification procedures. To demonstrate the process, we analyzed 5.28 million MS/MS spectra derived by eight laboratories from tryptic peptides of serum and plasma proteins.
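
The first two steps named in this abstract (removing low-quality spectra, then clustering spectra by mutual similarity) can be illustrated with a minimal Python sketch. The abstract does not give the quality criterion or the similarity measure, so everything concrete here is an assumption: spectra are modelled as dictionaries of binned m/z to intensity, "low quality" means fewer than `min_peaks` peaks, similarity is cosine similarity, and clustering is a simple greedy pass against each cluster's first member.

```python
import math

def cosine(a, b):
    """Cosine similarity between two spectra given as {m/z bin: intensity} dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cluster_spectra(spectra, min_peaks=3, threshold=0.9):
    """Drop sparse spectra, then greedily group the rest by similarity.

    min_peaks and threshold are illustrative values, not taken from the paper.
    """
    kept = [s for s in spectra if len(s) >= min_peaks]
    clusters = []  # each cluster is a list; its first member acts as representative
    for s in kept:
        for c in clusters:
            if cosine(s, c[0]) >= threshold:
                c.append(s)
                break
        else:
            clusters.append([s])  # no similar cluster found: start a new one
    return clusters

# Hypothetical toy spectra
spectra = [
    {100: 10.0, 200: 5.0, 300: 1.0},
    {100: 9.0, 200: 5.5, 300: 1.2},   # near-duplicate of the first
    {150: 4.0, 250: 8.0, 350: 2.0},   # no peaks shared with the others
    {100: 1.0},                       # too few peaks, filtered out
]
clusters = cluster_spectra(spectra)
print([len(c) for c in clusters])
```

Only one representative per cluster would then need to be searched against the sequence database, which is how reducing redundancy before the analysis lowers the computational effort.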

4.

Background

The immense diagnostic potential of human plasma has prompted great interest and effort in cataloging its contents, exemplified by the Human Proteome Organization (HUPO) Plasma Proteome Project (PPP) pilot project. Due to challenges in obtaining a reliable blood plasma protein list, HUPO later re-analysed their own original dataset with a more stringent statistical treatment that resulted in a much reduced list of high confidence (at least 95%) proteins compared with their original findings. In order to facilitate the discovery of novel biomarkers in the future and to realize the full diagnostic potential of blood plasma, we feel that there is still a need for an ultra-high confidence reference list (at least 99% confidence) of blood plasma proteins.

Methods

To address the complexity and dynamic protein concentration range of the plasma proteome, we employed a linear ion-trap-Fourier transform (LTQ-FT) and a linear ion trap-Orbitrap (LTQ-Orbitrap) for mass spectrometry (MS) analysis. Both instruments allow the measurement of peptide masses in the low ppm range. Furthermore, we employed a statistical score that allows database peptide identification searching using the products of two consecutive stages of tandem mass spectrometry (MS3). The combination of MS3 with very high mass accuracy in the parent peptide allows peptide identification with orders of magnitude more confidence than that typically achieved.

Results

Herein we established a high confidence set of 697 blood plasma proteins and achieved a high 'average sequence coverage' of more than 14 peptides per protein and a median of 6 peptides per protein. All proteins annotated as belonging to the immunoglobulin family as well as all hypothetical proteins whose peptides completely matched immunoglobulin sequences were excluded from this protein list. We also compared the results of using two high-end MS instruments as well as the use of various peptide and protein separation approaches. Furthermore, we characterized the plasma proteins using cellular localization information, as well as comparing our list of proteins to data from other sources, including the HUPO PPP dataset.

Conclusion

Superior instrumentation combined with rigorous validation criteria gave rise to a set of 697 plasma proteins in which we have very high confidence, demonstrated by an exceptionally low false peptide identification rate of 0.29%.

5.
One of the major challenges facing protein analysis is the dynamic range of protein expression within massively complex samples (Corthals, G. L. et al., Electrophoresis 2000, 21, 1104-1115). In plasma this difference is as great as ten orders of magnitude, and this is currently beyond the range of detection achievable by any of the analytical techniques. Plasma has the additional challenge of having a few highly abundant proteins, such as albumin, which mask the detection of lower abundance and biologically significant proteins. The use of the Gradiflow BF400 as a fractionation tool to deplete highly abundant albumin from human plasma is reported here. A sequential three-step protocol was performed on five plasma samples as part of the International Plasma Proteome Project organised by HUPO; four containing different anticoagulants: EDTA, citrate, heparin and a control sample (NIBSC); and a serum sample. Plasma from an alternate source also underwent fractionation and served as an in-house control. Time modulation between 1 and 7 h was observed for the depletion of albumin from these samples. Following albumin depletion, each fraction was trypsin-digested and the peptides were fractionated further using 2-D LC-MS/MS. Differences in the total number of proteins identified for each sample were also noted.

6.
Applications of InterPro in protein annotation and genome analysis   Total citations: 2, self-citations: 0, citations by others: 2
The applications of InterPro span a range of biologically important areas that includes automatic annotation of protein sequences and genome analysis. In automatic annotation of protein sequences InterPro has been utilised to provide reliable characterisation of sequences, identifying them as candidates for functional annotation. Rules based on the InterPro characterisation are stored and operated through a database called RuleBase. RuleBase is used as the main tool in the sequence database group at the EBI to apply automatic annotation to unknown sequences. The annotated sequences are stored and distributed in the TrEMBL protein sequence database. InterPro also provides a means to carry out statistical and comparative analyses of whole genomes. In the Proteome Analysis Database, InterPro analyses have been combined with other analyses based on CluSTr, the Gene Ontology (GO) and structural information on the proteins.

7.
Characterization of the human blood plasma proteome is critical to the discovery of routinely useful clinical biomarkers. We used an accurate mass and time (AMT) tag strategy with high-resolution mass accuracy cLC-FT-ICR MS to perform a global proteomic analysis of pilot study samples as part of the HUPO Plasma Proteome Project. HUPO reference serum and citrated plasma samples from African Americans, Asian Americans, and Caucasian Americans were analyzed, in addition to a Pacific Northwest National Laboratory reference serum and plasma. The AMT tag strategy allowed us to leverage two previously published "shotgun" proteomics experiments to perform global analyses on these samples in triplicate in less than 4 days total analysis time. A total of 722 (22% with multiple peptide identifications) International Protein Index redundant proteins, or 377 protein families by ProteinProphet, were identified over the six individual HUPO serum and plasma samples. The samples yielded a similar number of identified redundant proteins in the plasma samples (average 446 +/- 23) as found in the serum samples (average 440 +/- 20). These proteins were identified by an average of 956 +/- 35 unique peptides in plasma and 930 +/- 11 unique peptides in serum. In addition to this high-throughput analysis, the AMT tag approach was used with a Z-score normalization to compare relative protein abundances. This analysis highlighted both known differences in serum and citrated plasma such as fibrinogens, and reproducible differences in peptide abundances from proteins such as soluble activin receptor-like kinase 7b and glycoprotein m6b. The AMT tag strategy not only improved our sample throughput but also provided a basis for estimated quantitation.
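
The Z-score normalization mentioned in this abstract can be sketched as follows. This is a generic illustration, not the authors' procedure: the abstract does not say whether abundances were log-transformed first or which standard deviation was used, so the sketch assumes raw values and the population standard deviation, and the protein names and numbers are hypothetical.

```python
from statistics import mean, pstdev

def z_scores(abundances):
    """Map {protein: abundance} to {protein: z-score} within one sample."""
    values = list(abundances.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # All abundances equal: every protein sits exactly at the mean.
        return {p: 0.0 for p in abundances}
    return {p: (v - mu) / sigma for p, v in abundances.items()}

# Hypothetical abundances on different absolute scales in two samples
serum = {"protA": 2.0, "protB": 4.0, "protC": 6.0}
plasma = {"protA": 20.0, "protB": 40.0, "protC": 90.0}

zs, zp = z_scores(serum), z_scores(plasma)
# After normalization the two samples share a scale, so differences are comparable.
diff = {p: zp[p] - zs[p] for p in serum}
print(diff)
```

Because each sample is centred and scaled on its own distribution, a protein's z-score expresses its abundance relative to the rest of that sample, which is what makes serum and plasma measurements comparable despite different overall signal levels.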

8.
Through a multi-university and interdisciplinary project we have involved undergraduate biology and computer science research students in the functional annotation of maize genes and the analysis of their microarray expression patterns. We have created a database to house the results of our functional annotation of >4400 genes identified as being differentially regulated in the maize shoot apical meristem (SAM). This database is located at http://sam.truman.edu and is now available for public use. The undergraduate students involved in constructing this unique SAM database received hands-on training in an intellectually challenging environment, which has prepared them for graduate and professional careers in biological sciences. We describe our experiences with this project as a model for effective research-based teaching of undergraduate biology and computer science students, as well as for a rich professional development experience for faculty at predominantly undergraduate institutions.

9.
Since the advent of public data repositories for proteomics data, readily accessible results from high-throughput experiments have been accumulating steadily. Several large-scale projects in particular have contributed substantially to the amount of identifications available to the community. Despite the considerable body of information amassed, very few successful analyses have been performed and published on this data, leveling off the ultimate value of these projects far below their potential. A prominent reason published proteomics data is seldom reanalyzed lies in the heterogeneous nature of the original sample collection and the subsequent data recording and processing. To illustrate that at least part of this heterogeneity can be compensated for, we here apply a latent semantic analysis to the data contributed by the Human Proteome Organization's Plasma Proteome Project (HUPO PPP). Interestingly, despite the broad spectrum of instruments and methodologies applied in the HUPO PPP, our analysis reveals several obvious patterns that can be used to formulate concrete recommendations for optimizing proteomics project planning as well as the choice of technologies used in future experiments. It is clear from these results that the analysis of large bodies of publicly available proteomics data by noise-tolerant algorithms such as the latent semantic analysis holds great promise and is currently underexploited.

10.

Background  

Defining the location of genes and the precise nature of gene products remains a fundamental challenge in genome annotation. Interrogating tandem mass spectrometry data using genomic sequence provides an unbiased method to identify novel translation products. A six-frame translation of the entire human genome was used as the query database to search for novel blood proteins in the data from the Human Proteome Organization Plasma Proteome Project. Because this target database is orders of magnitude larger than the databases traditionally employed in tandem mass spectra analysis, careful attention to significance testing is required. Confidence of identification is assessed using our previously described Poisson statistic, which estimates the significance of multi-peptide identifications incorporating the length of the matching sequence, number of spectra searched and size of the target sequence database.
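
For readers unfamiliar with the term, a six-frame translation reads a DNA sequence in three offsets on the forward strand and three on the reverse complement. The Python sketch below shows the general technique using the standard genetic code; it is not the authors' pipeline, and the function names and the short example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def translate(seq):
    """Translate complete codons only; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )

def six_frame(dna):
    """Return frames +1, +2, +3 followed by -1, -2, -3 as protein strings."""
    dna = dna.upper()
    reverse_complement = dna.translate(COMPLEMENT)[::-1]
    return [
        translate(strand[offset:])
        for strand in (dna, reverse_complement)
        for offset in range(3)
    ]

frames = six_frame("ATGGCCTAA")
print(frames)  # ['MA*', 'WP', 'GL', 'LGH', '*A', 'RP']
```

Stop codons appear as `*`. Every genomic position is translated six times, which is one reason such a search database is so much larger than a curated protein database and why the abstract stresses significance testing.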

11.
12.
HUPO initiated the Plasma Proteome Project (PPP) in 2002. Its pilot phase has (1) evaluated advantages and limitations of many depletion, fractionation, and MS technology platforms; (2) compared PPP reference specimens of human serum and EDTA, heparin, and citrate-anti-coagulated plasma; and (3) created a publicly available knowledge base (www.bioinformatics.med.umich.edu/hupo/ppp; www.ebi.ac.uk/pride). Thirty-five participating laboratories in 13 countries submitted datasets. Working groups addressed (a) specimen stability and protein concentrations; (b) protein identifications from 18 MS/MS datasets; (c) independent analyses from raw MS/MS spectra; (d) search engine performance, subproteome analyses, and biological insights; (e) antibody arrays; and (f) direct MS/SELDI analyses. MS/MS datasets had 15 710 different International Protein Index (IPI) protein IDs; our integration algorithm applied to multiple matches of peptide sequences yielded 9504 IPI proteins identified with one or more peptides and 3020 proteins identified with two or more peptides (the Core Dataset). These proteins have been characterized with Gene Ontology, InterPro, Novartis Atlas, OMIM, and immunoassay-based concentration determinations. The database permits examination of many other subsets, such as 1274 proteins identified with three or more peptides. Reverse protein to DNA matching identified proteins for 118 previously unidentified ORFs. We recommend use of plasma instead of serum, with EDTA (or citrate) for anticoagulation. To improve resolution, sensitivity and reproducibility of peptide identifications and protein matches, we recommend combinations of depletion, fractionation, and MS/MS technologies, with explicit criteria for evaluation of spectra, use of search algorithms, and integration of homologous protein matches.
This Special Issue of PROTEOMICS presents papers integral to the collaborative analysis plus many reports of supplementary work on various aspects of the PPP workplan. These PPP results on complexity, dynamic range, incomplete sampling, false-positive matches, and integration of diverse datasets for plasma and serum proteins lay a foundation for development and validation of circulating protein biomarkers in health and disease.
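
The subsets described in this abstract (proteins identified with one or more, two or more, or three or more peptides) amount to counting distinct peptides per protein and applying a threshold. A minimal Python sketch of that bookkeeping follows; it does not reproduce the PPP integration algorithm, and the identifiers and peptide sequences are made up for illustration.

```python
from collections import defaultdict

def proteins_by_peptide_count(matches, min_peptides):
    """Return proteins supported by at least min_peptides distinct peptides.

    matches is an iterable of (protein_id, peptide_sequence) pairs.
    """
    peptides = defaultdict(set)
    for protein, peptide in matches:
        peptides[protein].add(peptide)  # the set ignores repeated observations
    return {p for p, seqs in peptides.items() if len(seqs) >= min_peptides}

# Hypothetical identifications
matches = [
    ("IPI_A", "LVNEVTEFAK"), ("IPI_A", "YLYEIAR"), ("IPI_A", "AEFAEVSK"),
    ("IPI_B", "DLGEENFK"), ("IPI_B", "DLGEENFK"),   # same peptide seen twice
    ("IPI_C", "TCVADESAENCDK"), ("IPI_C", "SLHTLFGDK"),
]

one_or_more = proteins_by_peptide_count(matches, 1)
two_or_more = proteins_by_peptide_count(matches, 2)   # analogous to the Core Dataset rule
three_or_more = proteins_by_peptide_count(matches, 3)
print(len(one_or_more), len(two_or_more), len(three_or_more))
```

Raising the threshold trades coverage for confidence, which is the pattern behind the 9504, 3020, and 1274 figures quoted in the abstract.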

13.
Omenn GS. Proteomics 2004, 4(5): 1235-1240
A comprehensive, systematic characterization of circulating proteins in health and disease will greatly facilitate development of biomarkers for prevention, diagnosis, and therapy of cancers and other diseases. The Human Proteome Organization Plasma Proteome Project pilot phase aims to (1) compare the advantages and limitations of many technology platforms; (2) contrast reference specimens of human plasma (ethylenediaminetetraacetic acid, heparin, citrate-anticoagulated) and serum, in terms of numbers of proteins identified and any interferences with various technology platforms; and (3) create a global knowledge base/data repository.

14.
The Human Proteome Organisation Brain Proteome Project aims at coordinating neuroproteomic activities with respect to analysis of development, aging, and evolution in humans and mice and at analysing normal aging processes as well as neurodegenerative diseases. Our group participated in the mouse pilot study of this project using two different 2-DE systems, to find out the optimal conditions for comprehensive gel-based differential proteome analysis. Besides assessing the best methodological conditions, we addressed the question of "How many biological replicate analyses have to be performed to get reliable statistically validated results?" In total 420 differences were detected in all analyses. Both 2-DE methods were found to be suitable for comprehensive differential proteome analysis. Nevertheless, each of the methods showed substantial advantages and disadvantages, so that modification of both systems is essential. From our results we conclude that, for optimal quantitative differential gel-based brain proteome analyses in the future, the sample preparation has to be slightly changed, the resolution of the first as well as the second dimension has to be advanced, the number of experiments has to be increased, and the 2D-DIGE system should be applied.

15.
Nasopharyngeal carcinoma (NPC), one of the most common cancers in Southeast Asia, is not easily diagnosed until advanced stages. To discover potential biomarkers for improving NPC diagnosis, we herein identified the aberrant plasma proteins in NPC patients. We first removed the top-seven abundant proteins from plasma samples of healthy controls and NPC patients, and then labeled the samples with different fluorescent cyanine dyes. The labeled samples were then mixed equally and fractionated with ion-exchange chromatography followed by SDS-PAGE. Proteins showing altered levels in NPC patients were identified by in-gel tryptic digestion and LC-MS/MS. When the biological roles of the 45 identified proteins were assessed via MetaCore™ analysis, the blood coagulation pathway emerged as the most significantly altered pathway in NPC plasma. Plasma kallikrein (KLKB1) and thrombin-antithrombin III complex (TAT) were chosen for evaluation as candidate NPC biomarkers because of their involvement in blood coagulation. ELISAs confirmed the elevation of their plasma levels in NPC patients versus healthy controls. Western blot and activity assays further showed that the KLKB1 active form was significantly increased in NPC plasma. Collectively, our results identified a significant alteration of the blood coagulation pathway in NPC patients, and KLKB1 and TAT may represent potential NPC biomarkers.

16.
Yamazaki Y, Okawa K, Yano T, Tsukita S, Tsukita S. Biochemistry 2008, 47(19): 5378-5386
A high level of structural organization of functional membrane domains in very narrow regions of a plasma membrane is crucial for the functions of plasma membranes and various other cellular functions. Conventional proteomic analyses are based on total soluble cellular proteins. Thus, because of insolubility problems, they have major drawbacks for use in analyses of low-abundance proteins enriched in very limited and specific areas of cells, as well as in analyses of the membrane proteins in two-dimensional gels. We optimized proteomic analyses of cell-cell adhering junctional membrane proteins on gels. First, we increased the purity of cell-cell junctions, which are very limited and specific areas for cell-cell adhesion, from hepatic bile canaliculi. We then enriched junctional membrane proteins via a guanidine treatment; these became selectively detectable on two-dimensionally electrophoresed gels after treatment with an extremely high concentration of NP-40. The framework of major junctional integral membrane proteins was shown on gels. These included six novel junctional membrane proteins of type I, type II, and tetraspanin, which were identified by mass spectrometry and by a database sequence homology search, as well as 12 previously identified junctional membrane proteins, such as cadherins and claudins.

17.
The pilot phase of the HUPO Plasma Proteome Project (PPP) is an international collaboration to catalog the protein composition of human blood plasma and serum by analyzing standardized aliquots of reference serum and plasma specimens using a variety of experimental techniques. Data management for this project included collection, integration, analysis, and dissemination of findings from participating organizations world-wide. Accomplishing this task required a communication and coordination infrastructure specific enough to support meaningful integration of results from all participants, but flexible enough to react to changing requirements and new insights gained during the course of the project and to allow participants with varying informatics capabilities to contribute. Challenges included integrating heterogeneous data, reducing redundant information to minimal identification sets, and data annotation. Our data integration workflow assembles a minimal and representative set of protein identifications, which account for the contributed data. It accommodates incomplete concordance of results from different laboratories, ambiguity and redundancy in contributed identifications, and redundancy in the protein sequence databases. Recommendations of the PPP for future large-scale proteomics endeavors are described.
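
One common way to reduce redundant identifications to a small representative set is a greedy set-cover heuristic: repeatedly pick the protein that explains the most peptides not yet accounted for. The Python sketch below illustrates that idea only. The abstract does not describe the PPP workflow in this detail, so the heuristic, the tie-breaking rule (alphabetical order), and the example data are all assumptions.

```python
def minimal_protein_set(protein_to_peptides):
    """Greedily choose proteins until every observed peptide is explained.

    protein_to_peptides maps each candidate protein to its set of matched peptides.
    Ties are broken by alphabetical order of the protein identifier.
    """
    uncovered = set().union(*protein_to_peptides.values())
    chosen = []
    while uncovered:
        # max() keeps the first best candidate, so sorting makes ties deterministic.
        best = max(
            sorted(protein_to_peptides),
            key=lambda p: len(protein_to_peptides[p] & uncovered),
        )
        chosen.append(best)
        uncovered -= protein_to_peptides[best]
    return chosen

# Hypothetical data: P2 and P4 are redundant with P1 and P3
candidates = {
    "P1": {"a", "b", "c"},
    "P2": {"a", "b"},
    "P3": {"c", "d"},
    "P4": {"d"},
}
selected = minimal_protein_set(candidates)
print(selected)  # ['P1', 'P3']
```

Greedy selection is not guaranteed to find the smallest possible set, but it is simple and deterministic, which matters when results from many laboratories have to be merged reproducibly.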

18.
19.
Psoriasis is a common chronic autoimmune skin disease involving the activation of T cells. To explore the proteomic signature of peripheral blood mononuclear cells, a quantitative analysis of their global proteome was conducted in samples from Chinese patients with new‐onset psoriasis (n = 31) and healthy controls (n = 32) using an integrated quantitative approach with tandem mass tag labeling and LC–MS/MS. Protein annotation, unsupervised hierarchical clustering, functional classification, functional enrichment and cluster, and protein–protein interaction analyses were performed. A total of 5178 proteins were identified, of which 4404 proteins were quantified. The fold‐change cutoff was set at 1.2 (patients vs controls); 335 proteins were upregulated, and 107 proteins were downregulated. The bioinformatics analysis indicated that the differentially expressed proteins were involved in processes related to the activation of immune cells including the nuclear factor kappa‐light‐chain‐enhancer of activated B cells (NF‐κB) pathway, cellular energy metabolism, and proliferation. Three upregulated proteins and two phosphorylated proteins in the NF‐κB pathway were verified or identified by Western blotting. These results confirm that the NF‐κB pathway is critical to psoriasis. In addition, many differentially expressed proteins identified in this study have never before been associated with psoriasis, and further studies on these proteins are necessary.
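
The fold-change cutoff of 1.2 can be read as a simple classification rule on the patient-to-control abundance ratio. The Python sketch below makes one such reading explicit. It assumes a symmetric rule (upregulated at a ratio of 1.2 or more, downregulated at 1/1.2 or less), which the abstract does not state, and it leaves out any statistical test the study may also have applied; the protein names and ratios are hypothetical.

```python
def classify_by_fold_change(ratios, cutoff=1.2):
    """Label each protein from its patient/control abundance ratio.

    Assumes a symmetric cutoff: >= cutoff is 'up', <= 1/cutoff is 'down'.
    """
    labels = {}
    for protein, ratio in ratios.items():
        if ratio >= cutoff:
            labels[protein] = "up"
        elif ratio <= 1.0 / cutoff:
            labels[protein] = "down"
        else:
            labels[protein] = "unchanged"
    return labels

# Hypothetical patient/control ratios
ratios = {"protA": 1.5, "protB": 0.5, "protC": 1.0, "protD": 1.1}
labels = classify_by_fold_change(ratios)
print(labels)
```

With a cutoff this close to 1, small systematic measurement differences can cross the threshold, so a rule like this is normally paired with a significance filter.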

20.