Similar articles
 20 similar articles found (search time: 31 ms)
1.
A new approach using targeted sequence collections has been developed for identifying endogenous peptides. This approach enables a fast, specific, and sensitive identification of endogenous peptides. Three different sequence collections were constituted in this study to mimic the peptidomic samples: SwePep precursors, SwePep peptides, and SwePep predicted. The searches for neuropeptides performed against these three sequence collections were compared with searches performed against the entire mouse proteome, which is commonly used to identify neuropeptides. These four sequence collections were searched with both Mascot and X!Tandem. Evaluation of the sequence collections was achieved using a set of manually identified and previously verified peptides. By using the three new sequence collections, which more accurately mimic the sample, 3 times as many peptides were significantly identified, with a false-positive rate below 1%, in comparison with the mouse proteome. The new sequence collections were also used to identify previously uncharacterized peptides from brain tissue; 27 previously uncharacterized peptides and potentially bioactive neuropeptides were identified. These novel peptides are cleaved from the peptide precursors at sites that are characteristic of prohormone convertases, and some of them have post-translational modifications that are characteristic of neuropeptides. The targeted protein sequence collections for different species are publicly available for download from SwePep.

2.
Human blood plasma can be obtained relatively noninvasively and contains proteins from most, if not all, tissues of the body. Therefore, an extensive, quantitative catalog of plasma proteins is an important starting point for the discovery of disease biomarkers. In 2005, we showed that different proteomics measurements using different sample preparation and analysis techniques identify significantly different sets of proteins, and that a comprehensive plasma proteome can be compiled only by combining data from many different experiments. Applying advanced computational methods developed for the analysis and integration of very large and diverse data sets generated by tandem MS measurements of tryptic peptides, we have now compiled a high-confidence human plasma proteome reference set with well over twice the identified proteins of previous high-confidence sets. It includes a hierarchy of protein identifications at different levels of redundancy following a clearly defined scheme, which we propose as a standard that can be applied to any proteomics data set to facilitate cross-proteome analyses. Further, to aid in development of blood-based diagnostics using techniques such as selected reaction monitoring, we provide a rough estimate of protein concentrations using spectral counting. We identified 20,433 distinct peptides, from which we inferred a highly nonredundant set of 1929 protein sequences at a false discovery rate of 1%. We have made this resource available via PeptideAtlas, a large, multiorganism, publicly accessible compendium of peptides identified in tandem MS experiments conducted by laboratories around the world.

3.
We utilized a setup based on extensive pre-fractionation of proteolytic peptides and nanoflow reversed-phase LC-MS/MS to identify the (sub)proteome of human follicular fluid (FF). In this in-depth screen, 246 specific proteins were identified, the majority of which are involved in coagulation- and immune-response pathways. Our aim is to define a set of FF protein markers that could predict oocyte quality.

4.
A proteome of a model organism, Caenorhabditis elegans, was analyzed by an integrated liquid chromatography (LC)-based protein identification system, which was constructed by microscale two-dimensional liquid chromatography (2DLC) coupled with electrospray ionization (ESI) tandem mass spectrometry (MS/MS) on a high-resolution hybrid mass spectrometer with an automated data analysis system. Soluble and insoluble protein fractions were prepared from a mixed growth phase culture of the worm C. elegans, digested with trypsin, and fractionated separately on the 2DLC system. The separated peptides were directly analyzed by on-line ESI-MS/MS in a data-dependent mode, and the resultant spectral data were automatically processed to search a genome sequence database, wormpep 66, for protein identification. The total number of proteins of the composite proteome identified by this method was 1,616, including 110 secreted/targeted proteins and 242 transmembrane proteins. The codon adaptation indices of the identified proteins suggested that the system could identify proteins of relatively low abundance, which are difficult to identify by conventional 2D-gel electrophoresis (GE) followed by an offline mass spectrometric analysis such as peptide mass fingerprinting. Among the approximately 5,400 peptides assigned in this study, many peptides with post-translational modifications, such as N-terminal acetylation and phosphorylation, were detected. This expression profile of C. elegans, containing 571 hypothetical gene products, will serve as the basic data of a major proteome set expressed in the worm.

5.
Increasing evidence suggests that proteins present in the angiosperm sieve tube system play an important role in the long distance signaling system of plants. To identify the nature of these putatively non-cell-autonomous proteins, we adopted a large scale proteomics approach to analyze pumpkin phloem exudates. Phloem proteins were fractionated by fast protein liquid chromatography using both anion and cation exchange columns and then either in-solution or in-gel digested following further separation by SDS-PAGE. A total of 345 LC-MS/MS data sets were analyzed using a combination of Mascot and X!Tandem against the NCBI non-redundant green plant database and an extensive Cucurbita maxima expressed sequence tag database. In this analysis, 1,209 different consensi were obtained of which 1,121 could be annotated from GenBank and BLAST search analyses against three plant species, Arabidopsis thaliana, rice (Oryza sativa), and poplar (Populus trichocarpa). Gene ontology (GO) enrichment analyses identified sets of phloem proteins that function in RNA binding, mRNA translation, ubiquitin-mediated proteolysis, and macromolecular and vesicle trafficking. Our findings indicate that protein synthesis and turnover, processes that were thought to be absent in enucleate sieve elements, likely occur within the angiosperm phloem translocation stream. In addition, our GO analysis identified a set of phloem proteins that are associated with the GO term "embryonic development ending in seed dormancy"; this finding raises the intriguing question as to whether the phloem may exert some level of control over seed development. The universal significance of the phloem proteome was highlighted by conservation of the phloem proteome in species as diverse as monocots (rice), eudicots (Arabidopsis and pumpkin), and trees (poplar). These results are discussed from the perspective of the role played by the phloem proteome as an integral component of the whole plant communication system.

6.
Comprehensive proteome profiling of breast cancer tissue samples is challenging, as the tissue samples contain many proteins with varying concentrations and modifications. We report an effective sample preparation strategy combined with liquid chromatography (LC) electrospray ionization (ESI) quadrupole time-of-flight (QTOF) MS/MS for proteome analysis of human breast cancer tissue. The complexity of the breast cancer tissue proteome was reduced by using protein precipitation from a tissue extract, followed by sequential protein solubilization in solvents of different solubilizing strength. The individual fractions of protein mixtures or subproteomes were subjected to trypsin digestion and the resultant peptides were separated by strong cation exchange (SCX) chromatography, followed by reversed-phase capillary LC combined with high resolution and high accuracy ESI-QTOF MS/MS. This approach identified 14407 unique peptides from 3749 different proteins based on peptide matches with scores above the threshold scores at the 95% confidence level in MASCOT database search of the acquired MS/MS spectra. The false positive rate of peptide matches was determined to be 0.95% by using the target-decoy sequence search strategy. On the basis of gene ontology categorization, the identified proteins represented a wide variety of biological functions, cellular processes, and cellular locations.
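
The target-decoy estimate mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes separate target and decoy searches and the simple ratio of decoy to target matches above a score threshold; the function name and the example scores are hypothetical.

```python
def target_decoy_rate(target_scores, decoy_scores, threshold):
    """Estimate the false positive rate among target matches at a score threshold.

    Assumption: decoy matches above the threshold approximate the number of
    incorrect target matches above the same threshold.
    """
    n_target = sum(score >= threshold for score in target_scores)
    n_decoy = sum(score >= threshold for score in decoy_scores)
    if n_target == 0:
        return 0.0  # no accepted matches, so nothing to mislabel
    return n_decoy / n_target


# Hypothetical scores, for illustration only.
targets = [50, 45, 40, 30, 20]
decoys = [35, 22, 10]
print(target_decoy_rate(targets, decoys, threshold=32))
```

Raising the threshold trades identifications for a lower estimated rate; a concatenated target-decoy search would use a slightly different formula.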

7.
Automated multidimensional capillary liquid chromatography-tandem mass spectrometry (LC-MS/MS) has been increasingly applied in various large scale proteome profiling efforts. However, comprehensive global proteome analysis remains technically challenging due to issues associated with sample complexity and dynamic range of protein abundances, which is particularly apparent in mammalian biological systems. We report here the application of a high efficiency cysteinyl peptide enrichment (CPE) approach to the global proteome analysis of human mammary epithelial cells (HMECs) which significantly improved both sequence coverage of protein identifications and the overall proteome coverage. The cysteinyl peptides were specifically enriched by using a thiol-specific covalent resin, fractionated by strong cation exchange chromatography, and subsequently analyzed by reversed-phase capillary LC-MS/MS. An HMEC tryptic digest without CPE was also fractionated and analyzed under the same conditions for comparison. The combined analyses of HMEC tryptic digests with and without CPE resulted in a total of 14416 confidently identified peptides covering 4294 different proteins with an estimated 10% gene coverage of the human genome. By using the high efficiency CPE, an additional 1096 relatively low abundance proteins were identified, resulting in 34.3% increase in proteome coverage; 1390 proteins were observed with increased sequence coverage. Comparative protein distribution analyses revealed that the CPE method is not biased with regard to protein Mr, pI, cellular location, or biological functions. These results demonstrate that the use of the CPE approach provides improved efficiency in comprehensive proteome-wide analyses of highly complex mammalian biological systems.
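
The 34.3% figure above can be reproduced from the reported counts. The sketch below assumes, since the abstract does not say so explicitly, that the baseline is the number of proteins found without CPE, taken as the combined total minus the proteins gained; the function name is hypothetical.

```python
def coverage_increase_pct(total_proteins, gained_proteins):
    """Percent increase in proteome coverage relative to the baseline.

    Assumption: baseline = total_proteins - gained_proteins.
    """
    baseline = total_proteins - gained_proteins
    return 100 * gained_proteins / baseline


# Counts taken from the abstract: 4294 proteins in total, 1096 gained with CPE.
print(f"{coverage_increase_pct(4294, 1096):.1f}%")  # prints "34.3%"
```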

8.
Ideally, shotgun proteomics would facilitate the identification of an entire proteome with 100% protein sequence coverage. In reality, the large dynamic range and complexity of cellular proteomes result in oversampling of abundant proteins, while peptides from low abundance proteins are undersampled or remain undetected. We tested the proteome equalization technology, ProteoMiner, in conjunction with Multidimensional Protein Identification Technology (MudPIT) to determine how the equalization of protein dynamic range could improve shotgun proteomics methods for the analysis of cellular proteomes. Our results suggest low abundance protein identifications were improved by two mechanisms: (1) depletion of high abundance proteins freed ion trap sampling space usually occupied by high abundance peptides and (2) enrichment of low abundance proteins increased the probability of sampling their corresponding more abundant peptides. Both mechanisms also contributed to dramatic increases in the quantity of peptides identified and the quality of MS/MS spectra acquired due to increases in precursor intensity of peptides from low abundance proteins. From our large data set of identified proteins, we categorized the dominant physicochemical factors that facilitate proteome equalization with a hexapeptide library. These results illustrate that equalization of the dynamic range of the cellular proteome is a promising methodology to improve low abundance protein identification confidence, reproducibility, and sequence coverage in shotgun proteomics experiments, opening a new avenue of research for improving proteome coverage.

9.
We report a global proteomic approach for analyzing brain tissue and for the first time a comprehensive characterization of the whole mouse brain proteome. Preparation of the whole brain sample incorporated a highly efficient cysteinyl-peptide enrichment (CPE) technique to complement a global enzymatic digestion method. Both the global and the cysteinyl-enriched peptide samples were analyzed by SCX fractionation coupled with reversed phase LC-MS/MS analysis. A total of 48,328 different peptides were confidently identified (>98% confidence level), covering 7792 nonredundant proteins (approximately 34% of the predicted mouse proteome). A total of 1564 and 1859 proteins were identified exclusively from the cysteinyl-peptide and the global peptide samples, respectively, corresponding to 25% and 31% improvements in proteome coverage compared to analysis of only the global peptide or cysteinyl-peptide samples. The identified proteins provide a broad representation of the mouse proteome with little bias evident due to protein pI, molecular weight, and/or cellular localization. Approximately 26% of the identified proteins with gene ontology (GO) annotations were membrane proteins, with 1447 proteins predicted to have transmembrane domains, and many of the membrane proteins were found to be involved in transport and cell signaling. The MS/MS spectrum count information for the identified proteins was used to provide a measure of relative protein abundances. The mouse brain peptide/protein database generated from this study represents the most comprehensive proteome coverage for the mammalian brain to date, and the basis for future quantitative brain proteomic studies using mouse models. The proteomic approach presented here may have broad applications for rapid proteomic analyses of various mouse models of human brain diseases.
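
The two coverage improvements quoted above are consistent with the reported counts, under the assumption (not stated explicitly in the abstract) that each percentage is the number of proteins found exclusively in one sample divided by the number found in the other sample alone:

```latex
\[
\frac{1564}{7792 - 1564} = \frac{1564}{6228} \approx 25\%,
\qquad
\frac{1859}{7792 - 1859} = \frac{1859}{5933} \approx 31\%.
\]
```

The first ratio is the gain from adding the cysteinyl-peptide sample to the global sample; the second is the gain from adding the global sample to the cysteinyl-peptide sample.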

10.
A comprehensive understanding of the mouse plasma proteome is important for studies using mouse models to identify protein markers of human disease. To enhance our analysis of the mouse plasma proteome, we have developed a method for isolating low-abundance proteins using a cysteine-containing glycopeptide strategy. This method involves two orthogonal affinity capture steps. First, glycoproteins are coupled to an azlactone copolymer gel using hydrazide chemistry and cysteine residues are then biotinylated. After trypsinization and extensive washing, tethered N-glycosylated tryptic peptides are released from the gel using PNGase F. Biotinylated cysteinyl-containing glycopeptides are then affinity selected using a monomeric avidin gel and analyzed by LC-MS/MS. We have applied the method to a proteome analysis of mouse plasma. In two independent analyses using 200 μL each of C57BL mouse plasma, 51 proteins were detected. Only 42 proteins were seen when the same plasma sample was analyzed by glycopeptides only. A total of 104 N-glycosylation sites were identified. Of these, 17 sites have hitherto not been annotated in the Swiss-Prot database whereas 48 were considered probable, potential, or by similarity, i.e., based on little or no experimental evidence. We show that analysis by cysteine-containing glycopeptides allows detection of low-abundance proteins such as the epidermal growth factor receptor, the Vitamin K-dependent protein Z, the hepatocyte growth factor activator, and the lymphatic endothelium-specific hyaluronan receptor as these proteins were not detected in the glycopeptide control analysis.

11.
We have developed a systematic analytical approach, termed PRISM (Proteomic Investigation Strategy for Mammals), that permits routine, large scale protein expression profiling of mammalian cells and tissues. PRISM combines subcellular fractionation, multidimensional liquid chromatography-tandem mass spectrometry-based protein shotgun sequencing, and two newly developed computer algorithms, STATQUEST and GOClust, as a means to rapidly identify, annotate, and categorize thousands of expressed mammalian proteins. The application of PRISM to adult mouse lung and liver resulted in the high confidence identification of over 2,100 unique proteins including more than 100 integral membrane proteins, 400 nuclear proteins, and 500 uncharacterized proteins, the largest proteome study carried out to date on this important model organism. Automated clustering of the identified proteins into Gene Ontology annotation groups allowed for streamlined analysis of the large data set, revealing interesting and physiologically relevant patterns of tissue and organelle specificity. PRISM therefore offers an effective platform for in-depth investigation of complex mammalian proteomes.

12.
Here, a comprehensive proteomic analysis of the chromoplasts purified from sweet orange using Nycodenz density gradient centrifugation is reported. A GeLC-MS/MS shotgun approach was used to identify the proteins of pooled chromoplast samples. A total of 493 proteins were identified from purified chromoplasts, of which 418 are putative plastid proteins based on in silico sequence homology and functional analyses. Based on the predicted functions of these identified plastid proteins, a large proportion (~60%) of the chromoplast proteome of sweet orange is constituted by proteins involved in carbohydrate metabolism, amino acid/protein synthesis, and secondary metabolism. Of note, HDS (hydroxymethylbutenyl 4-diphosphate synthase), PAP (plastid-lipid-associated protein), and psHSPs (plastid small heat shock proteins) involved in the synthesis or storage of carotenoid and stress response are among the most abundant proteins identified. A comparison of chromoplast proteomes between sweet orange and tomato suggested a high level of conservation in a broad range of metabolic pathways. However, the citrus chromoplast was characterized by more extensive carotenoid synthesis, extensive amino acid synthesis without nitrogen assimilation, and evidence for lipid metabolism concerning jasmonic acid synthesis. In conclusion, this study provides an insight into the major metabolic pathways as well as some unique characteristics of the sweet orange chromoplasts at the whole proteome level.

13.
Determination of the binding motif and identification of interaction partners of the modular domains such as SH2 domains can enhance our understanding of the regulatory mechanism of protein-protein interactions. We propose here a new computational method to achieve this goal by integrating the orthogonal information obtained from binding free energy estimation and peptide sequence analysis. We performed a proof-of-concept study on the SH2 domains of SAP and Grb2 proteins. The method involves the following steps: (1) estimating the binding free energy of a set of randomly selected peptides along with a sample of known binders; (2) clustering all these peptides using sequence and energy characteristics; (3) extracting a sequence motif, which is represented by a hidden Markov model (HMM), from the cluster of peptides containing the sample of known binders; and (4) scanning the human proteome to identify binding sites of the domain. The binding motifs of the SAP and Grb2 SH2 domains derived by the method agree well with those determined through experimental studies. Using the derived binding motifs, we have predicted new possible interaction partners for the Grb2 and SAP SH2 domains as well as possible interaction sites for interaction partners already known. We also suggested novel roles for the proteins by reviewing their top interaction candidates.
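
Step (4) above, scanning sequences with a derived motif, can be illustrated with a much simpler stand-in for the HMM: a gap-free position-specific scoring matrix slid along the sequence. Everything in this sketch is hypothetical, including the motif probabilities, the background and floor values, and the function names; it is not the motif reported in the study.

```python
import math

# Toy per-position residue probabilities for a 4-residue motif (illustrative only).
MOTIF = [
    {"Y": 0.97},
    {"V": 0.30, "I": 0.20, "E": 0.20},
    {"N": 0.90},
    {"V": 0.25, "Q": 0.25},
]
BACKGROUND = 0.05  # assumed uniform background over 20 amino acids
FLOOR = 0.01       # assumed probability for residues not listed at a position


def window_score(window):
    """Sum of per-position log-odds (motif versus background) for one window."""
    return sum(
        math.log(column.get(residue, FLOOR) / BACKGROUND)
        for column, residue in zip(MOTIF, window)
    )


def scan(sequence, cutoff=0.0):
    """Return (start, window, score) for every window scoring above the cutoff."""
    width = len(MOTIF)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = window_score(window)
        if score > cutoff:
            hits.append((start, window, round(score, 2)))
    return hits


print(scan("AAYVNVAA"))
```

A profile HMM would additionally model insertions and deletions; the fixed-width matrix is used here only to keep the scanning idea visible.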

14.
Park GW, Kwon KH, Kim JY, Lee JH, Yun SH, Kim SI, Park YM, Cho SY, Paik YK, Yoo JS. Proteomics 2006, 6(4): 1121-1132
In shotgun proteomics, proteins can be fractionated by 1-D gel electrophoresis and digested into peptides, followed by liquid chromatography to separate the peptide mixture. Mass spectrometry generates hundreds of thousands of tandem mass spectra from these fractions, and proteins are identified by database searching. However, the search scores are usually not sufficient to distinguish the correct peptides. In this study, we propose a confident protein identification method for high-throughput analysis of the human proteome. To build a filtering protocol in database search, we chose Pseudomonas putida KT2440 as a reference because this bacterial proteome contains fewer modifications and is simpler than the human proteome. First, the P. putida KT2440 proteome was filtered by reversed sequence database search and correlated by the molecular weight in 1-D-gel band positions. The characterization protocol was then applied to determine the criteria for clustering of the human plasma proteome into three different groups. This protein filtering method, based on bacterial proteome data analysis, represents a rapid way to generate a higher-confidence protein list of the human proteome, which includes some heavily modified and cleaved proteins.

15.
Analysis of the mouse liver proteome using advanced mass spectrometry   (Total citations: 3; self-citations: 0; citations by others: 3)
We report a large-scale analysis of mouse liver tissue comprising a novel fractionation approach and high-accuracy mass spectrometry techniques. Two fractions enriched for soluble and membrane proteins from 20 mg of frozen tissue were separated by one-dimensional electrophoresis followed by LC-MS/MS on the hybrid linear ion trap (LTQ)-Orbitrap mass spectrometer. Confident identification of 2210 proteins relied on at least two peptides. We combined this proteome with our previously reported organellar map (Foster et al. Cell 2006, 125, 187-199) to generate a very high confidence mouse liver proteome of 3244 proteins. The identified proteins represent the liver proteome with no discernible bias due to protein physicochemical properties, subcellular distribution, or biological function. Forty-seven percent of identified proteins were annotated as membrane-bound, and for 35.3%, transmembrane domains were predicted. For potential application in toxicology or clinical studies, we demonstrate that it is possible to consistently identify more than 1000 proteins in a single run.

16.
One of the challenges associated with large-scale proteome analysis using tandem mass spectrometry (MS/MS) and automated database searching is to reduce the number of false positive identifications without sacrificing the number of true positives found. In this work, a systematic investigation of the effect of 2MEGA labeling (N-terminal dimethylation after lysine guanidination) on the proteome analysis of a membrane fraction of an Escherichia coli cell extract by 2-dimensional liquid chromatography MS/MS is presented. By a large-scale comparison of MS/MS spectra of native peptides with those from the 2MEGA-labeled peptides, the labeled peptides were found to undergo facile fragmentation with enhanced a1 or a1-related (a1-17 and a1-45) ions derived from all N-terminal amino acids in the MS/MS spectra; these ions are usually difficult to detect in the MS/MS spectra of nonderivatized peptides. The 2MEGA labeling alleviated the biased detection of arginine-terminated peptides that is often observed in MALDI and ESI MS experiments. 2MEGA labeling was found not only to increase the number of peptides and proteins identified but also to generate enhanced a1 or a1-related ions as a constraint to reduce the number of false positive identifications. In total, 640 proteins were identified from the E. coli membrane fraction, with each protein identified based on peptide mass and sequence match of one or more peptides using the MASCOT database search algorithm from the MS/MS spectra generated by a quadrupole time-of-flight mass spectrometer. Among them, the subcellular locations of 336 proteins are presently known, including 258 membrane and membrane-associated proteins (76.8%). Among the classified proteins, there was a dramatic increase in the total number of integral membrane proteins identified in the 2MEGA-labeled sample (153 proteins) versus the unlabeled sample (77 proteins).

17.
Angiogenesis, or neovascularization, is tightly orchestrated by endogenous regulators that promote or inhibit the process. The fine-tuning of these pro- and anti-angiogenic elements (the angiogenic balance) helps establish the homeostasis in tissues, and any aberration leads to pathologic conditions. The type I thrombospondin repeats are a family of protein structural elements involved in the control of angiogenesis, and some proteins containing these repeats have been identified as negative regulators of angiogenesis. Here we identify a set of 11 novel, anti-angiogenic 18–20-amino acid peptides that are derived from proteins that belong to the CCN protein family and contain type I thrombospondin motifs. We have named these peptides spondinstatin-1, cyrostatin, connectostatin, nephroblastostatin, wispostatin-2, wispostatin-3, netrinstatin-5C, netrinstatin-5D, adamtsostatin-like-4, fibulostatin-6.1, and complestatin-C6 to reflect their origin. We have shown that these peptides inhibit proliferation and migration of human umbilical vein endothelial cells in vitro. By conducting a clustering analysis of the amino acid sequences using sequence similarity criteria and of the experimental results using a hierarchical clustering algorithm, we have demonstrated that there is an underlying correlation between the sequence and activity of the identified peptides. This combination of experimental and computational approaches introduces a novel systematic framework for studying peptide activity, identifying novel peptides with anti-angiogenic activity, and designing mimetic peptides with tailored properties.

18.
Understanding the progression of periodontal tissue destruction is at the forefront of periodontal research. The authors aimed to capture the dynamics of gingival tissue proteome during the initiation and progression of experimental (ligature‐induced) periodontitis in mice. Pressure cycling technology (PCT), a recently developed platform that uses ultra‐high pressure to disrupt tissues, is utilized to achieve efficient and reproducible protein extraction from ultra‐small amounts of gingival tissues in combination with liquid chromatography‐tandem mass spectrometry (MS). The MS data are processed using Progenesis QI and the regulated proteins are subjected to METACORE, STRING, and WebGestalt for functional enrichment analysis. A total of 1614 proteins with ≥2 peptides are quantified with an estimated protein false discovery rate of 0.06%. Unsupervised clustering analysis shows that the gingival tissue protein abundance is mainly dependent on the periodontitis progression stage. Gene ontology enrichment analysis reveals an overrepresentation in innate immune regulation (e.g., neutrophil‐mediated immunity and antimicrobial peptides), signal transduction (e.g., integrin signaling), and homeostasis processes (e.g., platelet activation and aggregation). In conclusion, the PCT‐assisted label‐free quantitative proteomics workflow applied here allowed cataloging the deepest gingival tissue proteome on a rapid timescale and provided novel mechanistic insights into host perturbation during periodontitis progression.

19.
20.
The identification of disease markers in human body fluids requires an extensive and thorough analysis of their protein constituents. In the present study, we have extended our analysis of the human cerebrospinal fluid (CSF) proteome using protein prefractionation followed by shotgun mass spectrometry. After the removal of abundant protein components from the mixture with the help of immunodepletion affinity chromatography, we used either anion exchange chromatography or sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) to further subfractionate the proteins present in CSFs. Each protein subfraction was enzyme digested and analyzed by tandem mass spectrometry and the resulting data evaluated using the Spectrum Mill software. Different subfractionation methods resulted in the identification of a grand total of 259 proteins in CSF from a patient with normal pressure hydrocephalus. The greatest number of proteins, 240 in total, were identified after prefractionating the CSF proteins by immunodepletion and SDS-PAGE. Immunodepletion combined with anion exchange fractionation resulted in 112 proteins and 74 proteins were found when only immunodepletion of the CSF samples was carried out. All methods used showed a significant increase in the number of identified proteins as compared with nondepleted and unfractionated CSF sample analysis, which yielded only 38 protein identifications. The present work establishes a platform for future studies aimed at a detailed comparative proteome analysis of CSFs from different groups of patients suffering from various psychiatric and neurological disorders.

