Similar literature
1.
Large-scale protein identifications from highly complex protein mixtures have recently been achieved using multidimensional liquid chromatography coupled with tandem mass spectrometry (LC/LC-MS/MS) and subsequent database searching with algorithms such as SEQUEST. Here, we describe a probability-based evaluation of false positive rates associated with peptide identifications from three different human proteome samples. Peptides from human plasma, human mammary epithelial cell (HMEC) lysate, and human hepatocyte (Huh)-7.5 cell lysate were separated by strong cation exchange (SCX) chromatography coupled offline with reversed-phase capillary LC-MS/MS analyses. The MS/MS spectra were first analyzed by SEQUEST, searching independently against both normal and sequence-reversed human protein databases, and the false positive rates of peptide identifications for the three proteome samples were then analyzed and compared. The observed false positive rates of peptide identifications for human plasma were significantly higher than those for the human cell lines when identical filtering criteria were used, suggesting that the false positive rates are significantly dependent on sample characteristics, particularly the number of proteins found within the detectable dynamic range. Two new sets of filtering criteria are proposed for human plasma and human cell lines, respectively, to provide an overall confidence of >95% for peptide identifications. The new criteria were compared, using a normalized elution time (NET) criterion (Petritis et al. Anal. Chem. 2003, 75, 1039-1048), with previously published criteria (Washburn et al. Nat. Biotechnol. 2001, 19, 242-247). The results demonstrate that the present criteria provide significantly higher levels of confidence for peptide identifications from mammalian proteomes without greatly decreasing the number of identifications.
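
The reversed-database evaluation described in this abstract can be illustrated with a minimal sketch. The abstract does not state the exact formula, so the estimator below (reversed-database hits divided by normal-database hits at the same score cut-off) is an assumption, one common choice when the two databases are searched independently. The function names and the single-score filter are hypothetical; real SEQUEST filtering usually combines several criteria.

```python
def false_positive_rate(target_scores, decoy_scores, threshold):
    """Estimate the false positive rate among normal-database hits
    scoring at or above `threshold` (assumed estimator: decoy / target)."""
    n_target = sum(1 for s in target_scores if s >= threshold)
    n_decoy = sum(1 for s in decoy_scores if s >= threshold)
    if n_target == 0:
        return 0.0  # nothing passes the filter, so nothing can be false
    return min(1.0, n_decoy / n_target)


def loosest_threshold(target_scores, decoy_scores, max_rate=0.05):
    """Return the lowest score cut-off whose estimated false positive
    rate is at most `max_rate`, or None if no cut-off qualifies."""
    for cutoff in sorted(set(target_scores)):
        if false_positive_rate(target_scores, decoy_scores, cutoff) <= max_rate:
            return cutoff
    return None
```

A `max_rate` of 0.05 corresponds to the >95% overall confidence mentioned in the abstract. Running the search separately per sample type would yield sample-specific cut-offs, in the spirit of the separate criteria proposed for plasma and cell lines.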

2.
We describe the application of a previously reported model for predicting peptide retention times in reversed-phase liquid chromatography (RPLC) (Petritis et al. Anal. Chem. 2003, 75, 1039) to improve peptide identification. The model uses peptide sequence information to generate a theoretical (predicted) elution time that can be compared with the observed elution time. Using data from a set of known proteins, the retention time parameter was incorporated into a discriminant function for use with tandem mass spectrometry (MS/MS) data analyzed with the peptide/protein identification program SEQUEST. For singly charged ions, the number of confident identifications increased by 12% when the elution time metric was included, compared to when mass spectral data were the sole source of information, in the context of a Drosophila melanogaster database. A 3-4% improvement was obtained for doubly and triply charged ions for the same biological system. Application to the larger Rattus norvegicus (rat) and human proteome databases resulted in an 8-9% overall increase in the number of confident identifications when both the discriminant function and elution time were used. The effect of adding "runner-up" hits (peptide matches that are not the highest scoring for a spectrum) from SEQUEST is also explored, and we find that the number of confident identifications is further increased by 1% when these hits are also considered. Finally, application of the discriminant functions derived in this work to approximately 2.2 million spectra from over three hundred LC-MS/MS analyses of peptides from human plasma protein resulted in a 16% increase in confident peptide identifications (9022 vs 7779) using elution time information. Further improvements from the use of elution time information can be expected as both the experimental control of elution time reproducibility and the predictive capability are improved.

3.
We report on the effectiveness of CID, HCD, and ETD for LC-FT MS/MS analysis of peptides using a tandem linear ion trap-Orbitrap mass spectrometer. A range of software tools and analysis parameters were employed to explore the use of CID, HCD, and ETD to identify peptides (isolated from human blood plasma) without the use of specific "enzyme rules". In the evaluation of an FDR-controlled SEQUEST scoring method, the use of accurate masses for fragments increased the number of identified peptides (by ~50%) compared to the use of conventional low accuracy fragment mass information, and CID provided the largest contribution to the identified peptide data sets compared to HCD and ETD. The FDR-controlled Mascot scoring method provided significantly fewer peptide identifications than SEQUEST (by 1.3-2.3 fold) and CID, HCD, and ETD provided similar contributions to identified peptides. Evaluation of de novo sequencing and the UStags method for more intense fragment ions revealed that HCD afforded more contiguous residues (e.g., ≥ 7 amino acids) than either CID or ETD. Both the FDR-controlled SEQUEST and Mascot scoring methods provided peptide data sets that were affected by the decoy database used and mass tolerances applied (e.g., identical peptides between data sets could be limited to ~70%), while the UStags method provided the most consistent peptide data sets (>90% overlap). The m/z ranges in which CID, HCD, and ETD contributed the largest number of peptide identifications were substantially overlapping. This work suggests that the three peptide ion fragmentation methods are complementary and that maximizing the number of peptide identifications benefits significantly from a careful match with the informatics tools and methods applied. These results also suggest that the decoy strategy may inaccurately estimate identification FDRs.

4.
High-throughput protein identification in mass spectrometry is predominantly achieved by first identifying tryptic peptides by a database search and then by combining the peptide hits for protein identification. One of the popular tools used for the database search is SEQUEST. Peptide identification is carried out by selecting SEQUEST hits above a specified threshold, the value of which is typically chosen empirically in an attempt to separate true identifications from false ones. These SEQUEST scores are not normalized with respect to the composition, length and other parameters of the peptides. Furthermore, there is no rigorous reliability estimate assigned to the protein identifications derived from these scores. Hence, the interpretation of SEQUEST hits generally requires human involvement, making it difficult to scale up the identification process for genome-scale applications. To overcome these limitations, we have developed a method, which combines a neural network and a statistical model, for normalizing SEQUEST scores, and also for providing a reliability estimate for each SEQUEST hit. This method improves the sensitivity and specificity of peptide identification compared to the standard filtering procedure used in the SEQUEST package, and provides a basis for estimating the reliability of protein identifications.

5.
Mass spectrometers that provide high mass accuracy such as FT-ICR instruments are increasingly used in proteomic studies. Although the importance of accurately determined molecular masses for the identification of biomolecules is generally accepted, its role in the analysis of shotgun proteomic data has not been thoroughly studied. To gain insight into this role, we used a hybrid linear quadrupole ion trap/FT-ICR (LTQ FT) mass spectrometer for LC-MS/MS analysis of a highly complex peptide mixture derived from a fraction of the yeast proteome. We applied three data-dependent MS/MS acquisition methods. The FT-ICR part of the hybrid mass spectrometer was either not exploited, used only for survey MS scans, or also used for acquiring selected ion monitoring scans to optimize mass accuracy. MS/MS data were assigned with the SEQUEST algorithm, and peptide identifications were validated by estimating the number of incorrect assignments using the composite target/decoy database search strategy. We developed a simple mass calibration strategy exploiting polydimethylcyclosiloxane background ions as calibrant ions. This strategy allowed us to substantially improve mass accuracy without reducing the number of MS/MS spectra acquired in an LC-MS/MS run. The benefits of high mass accuracy were greatest for assigning MS/MS spectra with low signal-to-noise ratios and for assigning phosphopeptides. Confident peptide identification rates from these data sets could be doubled by the use of mass accuracy information. It was also shown that improving mass accuracy at a cost to the MS/MS acquisition rate substantially lowered the sensitivity of LC-MS/MS analyses. The use of FT-ICR selected ion monitoring scans to maximize mass accuracy reduced the number of protein identifications by 40%.

6.
A very popular approach in proteomics is the so-called "shotgun LC-MS/MS" strategy. In its most commonly used form, a total protein digest is separated by ion exchange fractionation in the first dimension followed by off- or on-line RP LC-MS/MS. We replaced the first dimension by isoelectric focusing in the liquid phase using the Off-Gel device, producing 15 fractions. As peptides are separated by their isoelectric point in the first dimension and by hydrophobicity in the second, those experimentally derived parameters (pI and R(T)) can be used for the validation of potentially identified peptides. We applied this strategy to a cellular extract of Drosophila Kc167 cells and identified peptides with two different database search engines, namely PHENYX and SEQUEST, with PeptideProphet validation of the SEQUEST results. PHENYX returned 7582 potential peptide identifications and SEQUEST 7629. The SEQUEST results were reduced to 2006 identifications by validation with PeptideProphet. Validation of the PeptideProphet, SEQUEST and PHENYX results by pI and R(T) parameters confirmed 1837 PeptideProphet identifications, while in the remainder of the SEQUEST results another 1130 peptides were found to be likely hits. The validation of the PHENYX results led to a fixed p-value threshold of <1 × 10⁻⁴, which by itself sets the correct identification confidence to >95%, and a final count of 2034 highly confident peptide identifications was achieved after pI and R(T) validation. Although the PeptideProphet and PHENYX datasets have a very high confidence, the overlap of common identifications was only 79.4%, which can be explained by the fact that data interpretation was done by searching different protein databases with two search engines that use different algorithms. The approach used in this study allowed for an automated and improved data validation process for shotgun proteomics projects, producing MS/MS peptide identification results of very high confidence.

7.
Collision‐activated dissociation and electron‐transfer dissociation (ETD) each produce spectra containing unique features. Though several database search algorithms (e.g. SEQUEST, MASCOT, and Open Mass Spectrometry Search Algorithm) have been modified to search ETD data, this consists chiefly of the ability to search for c‐ and z•‐ions; additional ETD‐specific features are often unaccounted for and may hinder identification. Removal of these features via spectral processing increased total search sensitivity by ~20% for both human and yeast data sets; unique peptide identifications increased by ~17% for the yeast data sets and ~16% for the human data set.

8.
We have developed a proteomics technology featuring on-line three-dimensional liquid chromatography coupled to tandem mass spectrometry (3D LC-MS/MS). Using 3D LC-MS/MS, the yeast-soluble, urea-solubilized peripheral membrane and SDS-solubilized membrane protein samples collectively yielded 3019 unique yeast protein identifications with an average of 5.5 peptides per protein from the 6300-gene Saccharomyces Genome Database searched with SEQUEST. A single run of the urea-solubilized sample yielded 2255 unique protein identifications, suggesting high peak capacity and resolving power of 3D LC-MS/MS. After precipitation of SDS from the digested membrane protein sample, 3D LC-MS/MS allowed the analysis of membrane proteins. Among 1221 proteins containing two or more predicted transmembrane domains, 495 such proteins were identified. The improved yeast proteome data allowed the mapping of many metabolic pathways and functional categories. The 3D LC-MS/MS technology provides a suitable tool for global proteome discovery.

9.
Identifying deamidated peptides using low-resolution mass spectrometry is difficult because traditional database search programs cannot accurately detect modified peptides when the mass differences are only 0.984 Da. In this study, we utilized differential reversed-phase elution behavior of deamidated and corresponding unmodified peptide forms to significantly improve deamidation detection on a low-resolution LCQ ion trap instrument. We also improved the mass measurements of unmodified and deamidated peptide forms by averaging survey scans across each chromatogram peak. Tryptic digests of a series of normal (3-day old, 2-year old, 18-year old, 35-year old, and 70-year old) and cataractous (93-year old) human lens samples were used to produce large numbers of potentially deamidated peptides. The complex peptide mixtures were separated by strong cation exchange (SCX) chromatography followed by reversed-phase (RP) chromatography. Synthetic peptides were used to show that unmodified and deamidated peptides coeluted during the SCX separation and were completely resolved with the RP conditions used. Retention time shifts (RTS) and mass differences (ΔM) of deamidated lens peptides and their corresponding unmodified forms were manually determined for the 70-year old lens sample. These values were used to assign correct or incorrect deamidation identifications from SEQUEST searches where deamidation was specified as a variable modification. Manual validation of SEQUEST identifications from synthetic peptides, 3-day old, and 70-year old samples had an overall 42% deamidation detection accuracy. Filtering SEQUEST identifications using RTS and ΔM constraints resulted in >93% deamidation detection accuracy. An algorithm was developed to automate this method, and 72 crystallin deamidation sites, 18 of which were not previously reported in human lens tissue, were detected.
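
The retention-time-shift and mass-difference filter described in this abstract could be sketched roughly as follows. Only the 0.984 Da shift comes from the abstract; the function name, the tolerance values, and the use of the absolute retention time shift (the abstract does not state its direction or size) are hypothetical placeholders that would need to be calibrated on real data, as the authors did with their 70-year old lens sample.

```python
DEAMIDATION_SHIFT_DA = 0.984  # mass difference quoted in the abstract


def is_plausible_deamidation(unmod_mass, deam_mass, unmod_rt, deam_rt,
                             mass_tol=0.3, min_rts=0.5, max_rts=10.0):
    """Check a candidate unmodified/deamidated peptide pair.

    mass_tol (Da), min_rts and max_rts (minutes) are illustrative
    assumptions, not values reported in the abstract.
    """
    delta_m = deam_mass - unmod_mass      # measured mass difference
    rts = abs(deam_rt - unmod_rt)         # size of the RP retention time shift
    mass_ok = abs(delta_m - DEAMIDATION_SHIFT_DA) <= mass_tol
    rts_ok = min_rts <= rts <= max_rts    # forms must be resolved, but not wildly apart
    return mass_ok and rts_ok
```

Requiring both conditions at once reflects the idea in the abstract that mass alone is unreliable on a low-resolution instrument, while the chromatographic separation of the two forms supplies independent evidence.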

10.
Proteins from human liver carcinoma Huh7 cells, representing transformed liver cells, and cultured primary human fetal hepatocytes (HFH) and human HH4 hepatocytes, representing nontransformed liver cells, were extracted and processed for proteome analysis. Proteins from stimulated cells (interferon-alpha treatment for the Huh7 and HFH cells and induction of hepatitis C virus [HCV] proteins for the HH4 cells) and corresponding control cells were labeled with light and heavy cleavable ICAT reagents, respectively. The labeled samples were combined, trypsinized, and subjected to cation-exchange and avidin-affinity chromatography. The resulting cysteine-containing peptides were analyzed by microcapillary LC-MS/MS. The MS/MS spectra were initially analyzed by searching the human International Protein Index database using the SEQUEST software (1). Subsequently, new statistical algorithms were applied to the collective SEQUEST search results of each experiment. First, the PeptideProphet software (2) was applied to discriminate true assignments of MS/MS spectra to peptide sequences from false assignments, to assign a probability value for each identified peptide, and to compute the sensitivity and error rate for the assignment of spectra to sequences in each experiment. Second, the ProteinProphet software (3) was used to infer the protein identifications and to compute probabilities that a protein had been correctly identified, based on the available peptide sequence evidence. The resulting protein lists were filtered by a ProteinProphet probability score p ≥ 0.5, which corresponded to an error rate of less than 5%. A total of 1,296, 1,430, and 1,476 proteins or related protein groups were identified in three subdatasets from the Huh7, HFH, and HH4 cells, respectively. In total, these subdatasets contained 2,486 unique protein identifications from human liver cells. An increase of the threshold to p ≥ 0.9 (corresponding to an error rate of less than 1%) resulted in 2,159 unique protein identifications (1,146, 1,235, and 1,318 for the Huh7, HFH, and HH4 cells, respectively).

11.
The components of complex peptide mixtures can be separated by liquid chromatography, fragmented by tandem mass spectrometry, and identified by the SEQUEST algorithm. Inferring a mixture's source proteins requires that the identified peptides be reassociated. This process becomes more challenging as the number of peptides increases. DTASelect, a new software package, assembles SEQUEST identifications and highlights the most significant matches. The accompanying Contrast tool compares DTASelect results from multiple experiments. The two programs improve the speed and precision of proteomic data analysis.

12.
The identification of proteins from spectra derived from a tandem mass spectrometry experiment involves several challenges: matching each observed spectrum to a peptide sequence, ranking the resulting collection of peptide-spectrum matches, assigning statistical confidence estimates to the matches, and identifying the proteins. The present work addresses algorithms to rank peptide-spectrum matches. Many of these algorithms, such as PeptideProphet, IDPicker, or Q-ranker, follow a similar methodology that includes representing peptide-spectrum matches as feature vectors and using optimization techniques to rank them. We propose a richer and more flexible feature set representation that is based on the parametrization of the SEQUEST XCorr score and that can be used by all of these algorithms. This extended feature set allows a more effective ranking of the peptide-spectrum matches based on the target-decoy strategy, in comparison to a baseline feature set devoid of these XCorr-based features. Ranking using the extended feature set gives 10-40% improvement in the number of distinct peptide identifications relative to a range of q-value thresholds. While this work is inspired by the model of the theoretical spectrum and the similarity measure between spectra used specifically by SEQUEST, the method itself can be applied to the output of any database search. Further, our approach can be trivially extended beyond XCorr to any linear operator that can serve as similarity score between experimental spectra and peptide sequences.
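
To make the q-value thresholds mentioned in this abstract concrete, here is a minimal sketch of how q-values are commonly derived from a ranked list of target and decoy peptide-spectrum matches. It is a generic illustration of the target-decoy strategy, not the feature-based ranking method of the paper; the function name and the simple decoys/targets FDR estimate are assumptions.

```python
def q_values(psms):
    """Assign a q-value to each peptide-spectrum match.

    `psms` is a list of (score, is_decoy) pairs. Returns a list of
    (score, is_decoy, q_value) tuples sorted by descending score.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # Estimated FDR when accepting everything down to each rank.
    fdrs = []
    targets = decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(min(1.0, decoys / max(targets, 1)))

    # q-value: the smallest FDR at which this match is still accepted,
    # i.e. the running minimum taken from the bottom of the list upward.
    qs = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qs.append(running_min)
    qs.reverse()

    return [(s, d, q) for (s, d), q in zip(ranked, qs)]
```

Counting the target matches whose q-value falls below a chosen threshold gives the kind of identification count that can be compared between feature sets. A better ranking pushes decoys lower in the list and so admits more targets at the same threshold.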

13.
Lipid rafts were prepared according to standard protocols from Jurkat T cells stimulated via T cell receptor/CD28 cross-linking and from control (unstimulated) cells. Co-isolating proteins from the control and stimulated cell preparations were labeled with isotopically normal (d0) and heavy (d8) versions of the same isotope-coded affinity tag (ICAT) reagent, respectively. Samples were combined, proteolyzed, and resultant peptides fractionated via cation exchange chromatography. Cysteine-containing (ICAT-labeled) peptides were recovered via the biotin tag component of the ICAT reagents by avidin-affinity chromatography. On-line micro-capillary liquid chromatography tandem mass spectrometry was performed on both avidin-affinity (ICAT-labeled) and flow-through (unlabeled) fractions. Initial peptide sequence identification was by searching recorded tandem mass spectrometry spectra against a human sequence data base using SEQUEST software. New statistical data modeling algorithms were then applied to the SEQUEST search results. These allowed for discrimination between likely "correct" and "incorrect" peptide assignments, and from these the inferred proteins that they collectively represented, by calculating estimated probabilities that each peptide assignment and subsequent protein identification was a member of the "correct" population. For convenience, the resultant lists of peptide sequences assigned and the proteins to which they corresponded were filtered at an arbitrarily set cut-off of 0.5 (i.e. 50% likely to be "correct") and above and compiled into two separate datasets. In total, these data sets contained 7667 individual peptide identifications, which represented 2669 unique peptide sequences, corresponding to 685 proteins and related protein groups.

14.
Scaife C, Mowlds P, Grassl J, Polden J, Daly CN, Wynne K, Dunn MJ, Clyne RK. Proteomics 2010, 10(24): 4401-4414
Meiosis is the cell division that generates haploid gametes from diploid precursors. To provide insight into the functional proteome of budding yeast during meiosis, a 2-D DIGE kinetic approach was used to study proteins in the pH 6-11 range. Nearly 600 protein spots were visualised and 79 spots exhibited statistically significant changes in abundance as cells progressed through meiosis. Expression changes of up to 41-fold were detected and protein sequence information was obtained for 48 spots. Single protein identifications were obtained for 21 spots including different gel mobility forms of 5 proteins. A large number of post-translational events are suggested for these proteins, including processing, modification and import. The data are incorporated into an online 2-DE map of meiotic proteins in budding yeast, which extends our initial DIGE investigation of proteins in the pH 4-7 range. Together, the analyses provide peptide sequence data for 84 protein spots, including 50 single-protein identifications and gel mobility isoforms of 8 proteins. The largest classes of identified proteins include carbon metabolism, protein catabolism, protein folding, protein synthesis and the oxidative stress response. A number of the corresponding genes are required for yeast meiosis and recent studies have identified similar classes of proteins expressed during mammalian meiosis. This proteomic investigation and the resulting protein reference map make an important contribution towards a more detailed molecular view of yeast meiosis.

15.
The phosphorylation sites of two phosphorylated proteins, bovine β-casein and myelin basic protein (MBP), were identified by high performance liquid chromatography-electrospray ionization-quadrupole ion trap mass spectrometry (HPLC-ESI-QITMS). The tryptic digest of each protein was separated by HPLC, the molecular weight of each peptide was determined on-line by ESI-QITMS, and the MS/MS spectrum of each peptide was simultaneously obtained by combining collision-induced dissociation (CID) with the tandem mass spectrometry (MS/MS) capability of the QITMS. A phosphorylated peptide was identified by checking whether the difference between the observed and predicted molecular weights of the peptide was 80 u or an integral multiple of 80 u. The phosphorylation site was then identified through manual interpretation of the MS/MS spectrum of the phosphorylated peptide or through automatic SEQUEST database searching.
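
The 80 u test described in this abstract amounts to a small piece of arithmetic, sketched below. The nominal 80 u shift is taken from the abstract (the monoisotopic mass added by HPO3 is slightly lower, about 79.966 u); the function name, the tolerance, and the cap on the number of sites are hypothetical choices for illustration.

```python
PHOSPHATE_SHIFT_U = 80.0  # nominal mass added per phosphate group, per the abstract


def count_phosphates(observed_mw, predicted_mw, tol=0.5, max_sites=5):
    """Return the number of phosphate groups implied by the mass difference,
    or 0 if the difference is not close to a positive multiple of 80 u.

    tol (in u) and max_sites are illustrative assumptions.
    """
    diff = observed_mw - predicted_mw
    n = round(diff / PHOSPHATE_SHIFT_U)   # nearest whole number of phosphates
    if 1 <= n <= max_sites and abs(diff - n * PHOSPHATE_SHIFT_U) <= tol:
        return n
    return 0
```

For example, with made-up masses, `count_phosphates(1580.0, 1500.0)` returns 1 and `count_phosphates(1660.2, 1500.0)` returns 2. A non-zero result only flags the peptide as phosphorylated; locating the site still requires the MS/MS interpretation or database search described above.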

17.
The aim of this study was to screen and identify serum biomarkers for nephroblastoma in children using surface-enhanced laser desorption/ionization (SELDI) and other proteomics technologies. Surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS) was used to identify biomarkers in 100 children with nephroblastoma and 30 gender- and age-matched normal healthy children. The patients comprised 30 pre-operative cases and 70 post-operative cases. Differentially expressed serum proteins were screened. The target proteins were then separated, purified, and analyzed by multidimensional high performance liquid chromatography (HPLC). The peptide mass fingerprints (PMFs) of each protein were obtained after scanning with LC-MS/MS (LTQ). The proteins were identified using SEQUEST, and the biological functions and characteristics of these proteins were analyzed with bioinformatic methods. Two differential proteins (m/z 6455.5 and 9190.8) were obtained. According to SEQUEST, the molecular masses of these two proteins indicated that they were apolipoprotein C-I and haptoglobin, respectively. Expression of the two proteins was lower in the pre-surgery group than in the post-surgery and control groups (P < 0.01). In contrast, the expression of these two proteins was higher in the early stage than in the advanced stage of nephroblastoma. Apolipoprotein C-I and haptoglobin may be used as potential biomarkers to predict the degree of malignancy, therapeutic outcomes, and prognosis of nephroblastoma in children.

18.
The catalytic subunit of Saccharomyces cerevisiae type 1 protein phosphatase (PP1(C)) is encoded by the essential gene GLC7 and is involved in regulating diverse cellular processes. To identify potential regulatory or targeting subunits of yeast PP1(C), we tagged Glc7p at its amino terminus with protein A and affinity-purified Glc7p protein complexes from yeast. The purified proteins were separated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) and identified by peptide mass fingerprint analysis using matrix-assisted laser desorption/ionization (MALDI) mass spectrometry. To confirm the accuracy of our identifications, peptides from some of the proteins were also sequenced using high-performance liquid chromatography (HPLC) coupled to tandem mass spectrometry. Only four of the Glc7p-associated proteins that we identified (Mhp1p, Bni4p, Ref2p, and Sds22p) have previously been shown to interact with Glc7p, and multiple components of the CPF (cleavage and polyadenylation factor) complex involved in messenger RNA 3'-end processing were present as major components in the Glc7p-associated protein fraction. To confirm the interaction of Glc7p with this complex, we used the same approach to purify and characterize the components of the yeast CPF complex using protein A-tagged Pta1p. Six known components of the yeast CPF complex, together with Glc7p, were identified among the Pta1p-associated polypeptides using peptide mass fingerprint analysis. Thus Glc7p is a novel component of the CPF complex and may therefore be involved in regulating mRNA 3'-end processing.

19.
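A scoring method for peptide identification by tandem mass spectrometry
A number of high-throughput scoring algorithms that evaluate the agreement between a tandem mass spectrum and the theoretical spectra of peptides in a database are now used in shotgun proteomics research. In practice, however, these methods produce a large number of incorrect peptide identifications. Here we propose a new scoring algorithm for identifying peptide sequences from tandem mass spectra. The algorithm jointly considers the probability of occurrence of different ions in tandem mass spectra, the number of enzymatic cleavage sites in the peptide, the degree of matching between theoretical and experimental ions, and the matching pattern. Tests on large tandem mass spectrometry data sets show that PepSearch, the software developed from this algorithm, identifies peptides more accurately than SEQUEST, currently the most widely used software. PepSearch can be downloaded from http://compbio.sibsnet.org/projects/pepsearch.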

20.
Conditions for high-cell-density fermentations of Saccharomyces cerevisiae strains producing recombinant-DNA-derived proteins were established. Strains producing human immune interferon (IFN-gamma) from the constitutive PGK promoter failed to grow to high cell densities and exhibited low plasmid stability. Regulated expression of IFN-gamma was obtained in similar strains by employing a hybrid yeast GPD promoter that was subject to carbon source regulation due to the presence of regulatory DNA sequences from the yeast GAL 1,10 intergenic region. IFN-gamma expression programmed by this vector was low during growth on glucose and was induced by galactose. Previously defined fermentation conditions employing glucose as a carbon source were applied to this strain, resulting in high cell densities with higher plasmid stability. Various methods of galactose induction of IFN-gamma expression in high-cell-density fermentations were investigated. Optimal conditions resulted in a 2000-fold induction and production of 2 g IFN-gamma/L fermentation culture.
