Similar articles
 Found 20 similar articles; search took 31 ms
1.
Proteomics uses tandem mass spectrometers and correlation algorithms to match peptides and their fragment spectra to amino acid sequences. The replication of multiple liquid chromatography experiments with electrospray ionization of peptides and tandem mass spectrometry (LC–ESI–MS/MS) produces large sets of MS/MS spectra. There is a need to assess the quality of large sets of experimental results by statistical comparison with random expectation. Classical frequency-based statistics such as goodness-of-fit tests for peptide-to-protein distributions could be used to calculate the probability that an entire set of experimental results has arisen by random chance. The frequency distributions of authentic MS/MS spectra from human blood were compared with those of false-positive MS/MS spectra generated by a computer, or by instrument noise, using the chi-square test. Here the mechanics of the chi-square test used to compare the results in toto from a set of LC–ESI–MS/MS experiments with those of random expectation are detailed. The chi-square analysis of authentic spectra demonstrates unambiguously that, for blood proteins separated by partition chromatography prior to tryptic digestion, there is a low probability that the cumulative peptide-to-protein distribution is the same as that of random or noise false-positive spectra.
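The comparison described in this abstract can be illustrated with a minimal Python sketch of the Pearson chi-square statistic. The function name and all counts below are hypothetical and invented for illustration; they are not taken from the study, and the authors' actual procedure may differ.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic for two equal-length lists of counts."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    # Rescale the expected counts so both distributions share the same total.
    scale = sum(observed) / sum(expected)
    return sum(
        (o - e * scale) ** 2 / (e * scale)
        for o, e in zip(observed, expected)
    )


# Hypothetical counts of proteins matched by 1, 2, 3, or 4+ peptides.
observed_counts = [120, 60, 40, 80]  # authentic spectra (made-up numbers)
random_counts = [270, 20, 7, 3]      # random or noise spectra (made-up numbers)

statistic = chi_square_statistic(observed_counts, random_counts)
degrees_of_freedom = len(observed_counts) - 1
print(statistic, degrees_of_freedom)
```

A large statistic relative to the chi-square distribution with the given degrees of freedom would indicate a low probability that the observed distribution matches random expectation.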

2.
Identification of proteins by MS/MS is performed by matching experimental mass spectra against calculated spectra of all possible peptides in a protein database. The search engine assigns each spectrum a score indicating how well the experimental data agree with the expected data; a higher score means increased confidence in the identification. One problem is false-positive identifications, which arise from incomplete data as well as from the presence of misleading ions in experimental mass spectra due to gas-phase reactions, stray ions, contaminants, and electronic noise. We employed a novel technique for reducing false positives that is based on the combined use of two orthogonal fragmentation techniques, electron capture dissociation (ECD) and collisionally activated dissociation (CAD). Since ECD and CAD exhibit many complementary properties, their combined use greatly increased the analysis specificity, which was further strengthened by the high mass accuracy (approximately 1 ppm) afforded by Fourier transform mass spectrometry. The utility of this approach is demonstrated on a whole cell lysate from Escherichia coli. Analysis was performed in the data-dependent acquisition mode. Extraction of complementary sequence information was performed prior to the database search using software written in-house. Only masses involved in complementary pairs in the MS/MS spectrum from the same or orthogonal fragmentation techniques were submitted to the database search. ECD/CAD identified twice as many proteins at a fixed statistically significant confidence level, with on average a 64% higher Mascot score. The confidence in protein identification was thereby increased by more than 1 order of magnitude. The combined ECD/CAD searches were on average 20% faster than CAD-only searches. A specially developed test with scrambled MS/MS data revealed that the number of false-positive identifications was dramatically reduced by the combined use of CAD and ECD.

3.
Microcolumn RPLC (μRPLC) is one of the optimum separation modes for shotgun proteomic analysis. To identify as many proteins as possible by MS/MS, improving the separation efficiency and peak capacity of μRPLC is indispensable. Although increasing the column length is one effective solution, the preparation of a long microcolumn is rather difficult due to the high backpressure generated during the packing procedure. In our recent work, by connecting microcolumns of 5, 10, and 15 cm length via unions with minimal dead volume, long microcolumns with lengths up to 30 cm were obtained, with which 318 proteins were identified from proteins extracted from Escherichia coli by μRPLC‐ESI MS/MS, and similar distributions of Mw and pI were found with single and various coupled microcolumns. Furthermore, by using MS/MS with improved sensitivity and such a serially coupled 30 cm long microcolumn, 1692 proteins were identified within 7 h from rat brain tissue, with a false positive rate (FPR) <1%. All these results demonstrate that serially coupled microcolumns show great promise for improving the separation capacity of μRPLC in shotgun proteomic analysis.

4.
The Escherichia coli proteome was digested with trypsin and fractionated using SPE on a C18 SPE column. Seven fractions were collected and analyzed by CZE‐ESI‐MS/MS. The separation was performed in a 60‐cm‐long linear polyacrylamide‐coated capillary with a 0.1% v/v formic acid separation buffer. An electrokinetic sheath‐flow electrospray interface was used to couple the separation capillary with an Orbitrap‐Velos operating in higher‐energy collisional dissociation mode. Each CZE‐ESI‐MS/MS run lasted 50 min and total MS time was 350 min. A total of 23 706 peptide‐spectrum matches, 4902 peptide IDs, and 871 protein group IDs were generated using MASCOT with a false discovery rate of less than 1% on the peptide level. The total mass spectrometer analysis time was less than 6 h, the sample identification rate (145 proteins/h) was more than two times higher than in previous studies of the E. coli proteome, and the amount of sample consumed (<1 μg) was roughly fourfold less than in previous studies. These results demonstrate that CZE is a useful tool for the bottom‐up analysis of prokaryote proteomes.

5.
The proteins secreted by prostate cancer cells (PC3(AR)6) were separated by strong anion exchange chromatography, digested with trypsin and analyzed by unbiased liquid chromatography tandem mass spectrometry with an ion trap. The spectra were matched to peptides within proteins using a goodness-of-fit algorithm that showed a low false positive rate. The parent ions for MS/MS were randomly and independently sampled from a log-normal population and therefore could be analyzed by ANOVA. Normal distribution analysis confirmed that the parent and fragment ion intensity distributions were sampled over 99.9% of their range that was above the background noise. Arranging the ion intensity data with the identified peptide and protein sequences in structured query language (SQL) permitted the quantification of ion intensity across treatments, proteins and peptides. The intensities of 101,905 fragment ions from 1421 peptide precursors of 583 peptides from 233 proteins separated over 11 sample treatments were computed together in one ANOVA model using the statistical analysis system (SAS) prior to Tukey-Kramer honestly significant difference (HSD) testing. Thus, complex mixtures of proteins were identified and quantified with a high degree of confidence using an ion trap without isotopic labels, multivariate analysis or comparison of chromatographic retention times.

6.
Several academic software tools are available to help with the validation and reporting of proteomics data generated by MS analyses. However, to our knowledge, none of them has been designed to meet the particular needs arising from the study of organisms whose genomes are not sequenced. In that context, we have developed OVNIp, an open‐source application which facilitates the whole process of interpreting proteomics results. One of its unique attributes is its capacity to compile multiple results (from several search engines and/or several databank searches) and to resolve conflicting interpretations. Moreover, OVNIp enables automated exploitation of de novo sequences generated from unassigned MS/MS spectra, leading to higher sequence coverage and enhancing confidence in the identified proteins. The exploitation of these additional spectra might also identify novel proteins through an MS‐BLAST search, which can easily be run from the OVNIp interface. Beyond this primary scope, OVNIp can also benefit users who are looking for a simple standalone application both to visualize and confirm MS/MS result interpretations through a simple graphical interface and to generate reports according to user‐defined forms, which may integrate the prerequisites for publication. Sources, documentation and a stable release for Windows are available at http://wwwappli.nantes.inra.fr:8180/OVNIp .

7.
Mass spectrometry data are often corrupted by noise. It is very difficult to simultaneously detect low-abundance peaks and reduce false-positive peak detection caused by noise. In this paper, we propose to improve peak detection using an additional constraint: the consistent appearance of similar true peaks across multiple spectra. We observe that false-positive peaks in general do not repeat themselves well across multiple spectra. When we align all the identified peaks (including false-positive ones) from multiple spectra together, the false-positive peaks are not as consistent as the true peaks. Thus, we propose to use information from other spectra in order to reduce false-positive peaks. The new method improves the detection of peaks over traditional peak detection methods based on a single spectrum. Consequently, the discovery of cancer biomarkers also benefits from this improvement. Source code and additional data are available at: http://www.ece.ust.hk/~eeyu/mspeak.htm.
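The idea that true peaks recur across spectra while noise peaks do not can be sketched as follows. This is a deliberately simplified illustration and not the authors' algorithm: the function name, the m/z tolerance, the minimum fraction, and the example values are all assumptions, and peaks are grouped by simple chaining of neighbouring m/z values rather than by a proper alignment.

```python
def consistent_peaks(spectra, tolerance=0.5, min_fraction=0.5):
    """Return mean m/z values of peaks that recur across enough spectra.

    spectra: list of lists of m/z values, one inner list per spectrum.
    """
    # Pool all peaks, remembering which spectrum each one came from.
    tagged = sorted(
        (mz, index) for index, peaks in enumerate(spectra) for mz in peaks
    )
    # Chain peaks whose m/z lies within `tolerance` of the previous peak.
    groups = []
    for mz, index in tagged:
        if groups and mz - groups[-1][-1][0] <= tolerance:
            groups[-1].append((mz, index))
        else:
            groups.append([(mz, index)])
    # Keep groups that appear in a sufficient fraction of the spectra.
    kept = []
    for group in groups:
        n_spectra = len({index for _, index in group})
        if n_spectra / len(spectra) >= min_fraction:
            kept.append(sum(mz for mz, _ in group) / len(group))
    return kept


# Hypothetical peak lists from three spectra (made-up m/z values).
example_spectra = [[100.0, 250.3], [100.2, 400.0], [99.9, 250.1]]
recurring = consistent_peaks(example_spectra, tolerance=0.5, min_fraction=0.6)
print(recurring)
```

In this made-up example the peak near m/z 400 appears in only one of three spectra and would be discarded as a likely false positive.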

8.
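[Objective] Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) identifies microbial species from their characteristic protein fingerprint spectra. This study used genomics and MALDI-TOF MS to identify ribosomal protein markers of bacteria in the class Actinobacteria. [Methods] Representative species of the class Actinobacteria were selected from a MALDI-TOF MS spectral database and searched for in genome databases. The ribosomal protein sequences of the target strains or their reference strains were retrieved, and theoretical molecular masses were calculated and used to annotate ribosomal protein signals in the MALDI-TOF MS fingerprint spectra of the target strains. [Results] In total, 31 ribosomal proteins were annotated in the MALDI-TOF MS spectra of 142 actinobacterial strains from 8 orders, 24 families, 53 genera, and 114 species. The number of ribosomal protein signals differed significantly among the fingerprint spectra of the strains, as did the number of times each ribosomal protein signal was annotated. Fifteen ribosomal proteins were annotated in more than half of the spectra; the most frequently annotated was the large-subunit ribosomal protein L36. [Conclusion] This study identified 15 ribosomal protein signals that are common in the MALDI-TOF MS spectra of bacteria in the class Actinobacteria, which can provide a basis for developing methods that identify actinobacteria by recognizing the characteristic mass-spectral peaks of ribosomal proteins.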

9.
Ammonium cationisation has been used for taxoid profiling of partially purified methanolic extracts of needles of Taxus wallichiana growing in different regions of the Himalayas (Kashmir, Himachal Pradesh, UP Hills, Darjeeling, Sikkim and Arunachal Pradesh) by electrospray ionisation tandem mass spectrometry (MS/MS). The MS/MS spectra of the [M + NH4]+ or [M + H]+ ions gave structurally diagnostic fragment ions which revealed information about the taxane skeleton as well as the number and nature of the substituents. The rearranged 11(15→1)-abeo-taxanes showed a characteristic elimination of the hydroxyisopropyl group with an acetoxy/benzoyloxy group from C-9. The identification of the taxoids was achieved by comparison of the MS/MS spectra with those of authentic taxoids or was based on biogenetic grounds. The results were corroborated by liquid chromatography-MS analysis. Out of the 50 taxoids identified, 21 belonged to the rearranged class. The presence of paclitaxel in the samples from four regions was confirmed; the study also revealed the occurrence of several basic taxoids in these samples. MS/MS profiling by electrospray ionisation was shown to be a fast and reliable technique for the analysis of taxoid samples.

10.
Proteomic data from embryos are essential for the completion of a whole-proteome catalog because certain proteins are expressed only in embryos. In this study, using reverse phase LC‐MS/MS combined with 1‐D SDS‐PAGE, we identified 1625 mammalian and 735 Sus scrofa proteins from porcine zygotes, including both cytosolic and membranous proteins. We also found that the global protein profiles of parthenogenetically activated (PA) and in vitro fertilized (IVF) zygotes were similar, but differences in the expression of individual proteins were also evident. These differences were not due to culture conditions, polyspermy or non‐activation of oocytes, as the same culture method was used in both groups, the frequency of polyspermy was 24.3±3.0% and the rates of oocyte activation did not differ (p>0.05) between PA and IVF embryos. Consistent with the proteomic data, fluorescent Hoechst 33342 staining and a terminal deoxynucleotidyl transferase dUTP nick end labeling assay also revealed that PA embryos were of poor quality, as they contained fewer cells per blastocyst and were more predisposed to apoptosis (p<0.05), although their in vitro development rates were similar. To our knowledge, this is the first report on global peptide sequencing and quantification of protein in PA and IVF embryos by LC‐MS/MS, and it may be useful as a reference map for future studies.

11.
Ahrné E, Ohta Y, Nikitin F, Scherl A, Lisacek F, Müller M. Proteomics, 2011, 11(20): 4085-4095
The relevance of libraries of annotated MS/MS spectra is growing with the amount of proteomic data generated in high-throughput experiments. These reference libraries provide a fast and accurate way to identify newly acquired MS/MS spectra. In the context of multiple hypothesis testing, the number of false-positive identifications expected in the final result list is controlled by calculating the false discovery rate (FDR). In a classical sequence search, where experimental MS/MS spectra are compared with theoretical peptide spectra calculated from a sequence database, the FDR is estimated by searching randomized or decoy sequence databases. Despite ongoing discussion on exactly how the FDR has to be calculated, this method is widely accepted in the proteomic community. Recently, similar approaches to control the FDR of spectrum library searches were discussed. We present in this paper a detailed analysis of the similarity between spectra of distinct peptides to set the basis of our own solution for decoy library creation (DeLiberator). It differs from previously published results in some key points, mainly in implementing new methods that prevent decoy spectra from being too similar to the original library spectra while keeping important features of real MS/MS spectra. Using different proteomic data sets and library creation methods, we evaluate our approach and compare it with alternative methods.
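The decoy-based FDR estimate mentioned in this abstract can be sketched in a few lines of Python. Since the abstract itself notes that the exact formula is debated, this sketch uses one common convention (decoy hits divided by target hits above a score threshold) as an assumption. The function name, scores, and labels are hypothetical, and this is not the DeLiberator implementation.

```python
def decoy_fdr(scores, is_decoy, threshold):
    """Estimate the FDR at a score threshold as decoy hits / target hits."""
    targets = sum(
        1 for score, decoy in zip(scores, is_decoy)
        if score >= threshold and not decoy
    )
    decoys = sum(
        1 for score, decoy in zip(scores, is_decoy)
        if score >= threshold and decoy
    )
    if targets == 0:
        return 0.0  # no accepted target matches, so nothing to estimate
    return decoys / targets


# Hypothetical search results: match scores and whether each hit is a decoy.
match_scores = [50, 40, 30, 20, 10]
decoy_flags = [False, False, True, False, True]

print(decoy_fdr(match_scores, decoy_flags, threshold=25))
```

Raising the threshold in this made-up example removes the decoy hit from the accepted list, which lowers the estimated FDR at the cost of accepting fewer matches.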

12.
A common problem encountered when performing large‐scale MS proteome analysis is the loss of information due to the high percentage of unassigned spectra. To determine the causes behind this loss we have analyzed the proteome of one of the smallest living bacteria that can be grown axenically, Mycoplasma pneumoniae (729 ORFs). The proteome of M. pneumoniae cells, grown in defined media, was analyzed by MS. An initial search with both Mascot and a species‐specific NCBInr database with common contaminants (NCBImpn) resulted in around 79% of the acquired spectra not having an assignment. The percentage of non‐assigned spectra was reduced to 27% after re‐analysis of the data with the PEAKS software, thereby increasing the proteome coverage of M. pneumoniae from the initial 60% to over 76%. Nonetheless, 33 413 spectra with assigned amino acid sequences could not be mapped to any NCBInr database protein sequence. Approximately 1% of these unassigned peptides corresponded to PTMs and 4% to M. pneumoniae protein variants (deamidation and translation inaccuracies). The most abundant peptide sequence variants (Phe‐Tyr and Ala‐Ser) could be explained by alterations in the editing capacity of the corresponding tRNA synthases. About another 1% of the peptides not associated with any protein had repetitions of the same aromatic/hydrophobic amino acid at the N‐terminus, or had Arg/Lys at the C‐terminus. Thus, in a model system, we have maximized the number of assigned spectra to 73% (51 453 out of the 70 040 initially acquired spectra). All MS data have been deposited in the ProteomeXchange with identifier PXD002779 ( http://proteomecentral.proteomexchange.org/dataset/PXD002779 ).

13.
Peak detection is a pivotal first step in biomarker discovery from MS data and can significantly influence the results of downstream data analysis steps. We developed a novel automatic peak detection method for prOTOF MS data, which does not require a priori knowledge of protein masses. Random noise is removed by an undecimated wavelet transform and chemical noise is attenuated by an adaptive short‐time discrete Fourier transform. Isotopic peaks corresponding to a single protein are combined by extracting an envelope over them. Depending on the S/N, the desired peaks in each individual spectrum are detected and those with the highest intensity among their peak clusters are recorded. The common peaks among all the spectra are identified by choosing an appropriate cut‐off threshold in the complete linkage hierarchical clustering. To remove the 1 Da shifting of the peaks, the peak corresponding to the same protein is determined as the detected peak with the largest number among its neighborhood. We validated this method using a data set of serial peptide and protein calibration standards. Compared with the MoverZ program, our new method detects more peaks and significantly enhances the S/N of the peaks after the chemical noise removal. We then successfully applied this method to a data set of prOTOF MS spectra of albumin and albumin‐bound proteins from serum samples of 59 patients with carotid artery disease, compared to vascular disease‐free patients, to detect peaks with S/N≥2. Our method is easily implemented and is highly effective at defining peaks that will be used for disease classification or to highlight potential biomarkers.

14.
LC-MS/MS analysis on a linear ion trap LTQ mass spectrometer, combined with data processing and with stringent and sequence-similarity database searching tools, was employed in a layered manner to identify proteins in organisms with unsequenced genomes. Highly specific stringent searches (MASCOT) were applied as a first-layer screen to identify either known (i.e. present in a database) proteins, or unknown proteins sharing identical peptides with related database sequences. Once the confidently matched spectra were removed, the remainder was filtered against a nonannotated library of background spectra, which cleaned the dataset of spectra of common protein and chemical contaminants. The rectified spectral dataset was further subjected to rapid batch de novo interpretation by PepNovo software, followed by an MS BLAST sequence-similarity search that used multiple redundant and partially accurate candidate peptide sequences. Importantly, a single dataset was acquired at uncompromised sensitivity with no need for manual selection of MS/MS spectra for subsequent de novo interpretation. This approach enabled a completely automated identification of novel proteins that were otherwise missed by conventional database searches.

15.
The status of 13 trace elements (both essential and toxic) was investigated in individual parts of the winter wheat plant (Triticum aestivum) taken during its whole cultivation period. The study includes the determination of total concentrations, portions soluble in 0.02 mol L⁻¹ Tris-HCl buffer solution (pH = 7.5), and the fractionation of soluble species of elements by SEC and ICP/MS. Ligands of trace elements from a low-molecular weight SEC fraction were isolated by affinity chromatography and characterised by MALDI/MS analyses and by amino acid composition. Inhomogeneous accumulation of trace elements was found in the analysed plant tissues. The concentrations of elements are also affected by the maturity of the plants. The distribution of the soluble species of the elements between chromatographic fractions exhibited some regularity in all the samples. Substantial amounts of trace elements are located in a low-molecular weight fraction (< 2 kDa). Only chromatograms of Zn (grain) and Cu (all samples) contain significant medium-molecular and high-molecular weight fractions. Compounds isolated from the low-molecular weight fractions are rich in cysteine and dicarboxylic amino acids or their amides. MALDI/MS spectra of these compounds isolated from shoots, straw and grain confirmed the presence of the phytochelatin PC5.

16.
MassMatrix is a program that matches tandem mass spectra with theoretical peptide sequences derived from a protein database. The program uses a mass accuracy sensitive probabilistic score model to rank peptide matches. The MS/MS search software was evaluated using a high mass accuracy dataset, and its results were compared with those from MASCOT, SEQUEST, X!Tandem, and OMSSA. For the high mass accuracy data, MassMatrix provided better sensitivity than MASCOT, SEQUEST, X!Tandem, and OMSSA for a given specificity, and the percentage of false positives was 2%. More importantly, all manually validated true positives corresponded to a unique peptide/spectrum match. The presence of decoy sequences and additional variable PTMs did not significantly affect the results from the high mass accuracy search. MassMatrix performs well when compared with MASCOT, SEQUEST, X!Tandem, and OMSSA with regard to search time. MassMatrix was also run on a distributed-memory cluster and achieved search speeds of ~100 000 spectra per hour when searching against a complete human database with eight variable modifications. The algorithm is available for public searches at http://www.massmatrix.net.

17.
Pulmonary tuberculosis (TB) caused by Mycobacterium tuberculosis is a chronic disease. Currently, there are no sufficiently validated biomarkers for early diagnosis of TB infection. In this study, a panel of potential serum biomarkers distinguishing patients with pulmonary TB from healthy controls was identified using the iTRAQ‐coupled 2D LC‐MS/MS technique. Among 100 differentially expressed proteins screened, 45 proteins were upregulated (>1.25‐fold at p < 0.05) and 55 proteins were downregulated (<0.8‐fold at p < 0.05) in the TB serum. Bioinformatics analysis revealed that the differentially expressed proteins were related to the response to stimulus, metabolic processes, and immune system processes. The significantly differential expression of apolipoprotein CII (APOCII), CD5 antigen‐like (CD5L), hyaluronan‐binding protein 2 (HABP2), and retinol‐binding protein 4 (RBP4) was further confirmed using immunoblotting and ELISA analysis. By forward stepwise multivariate regression analysis, a panel of serum biomarkers including APOCII, CD5L, and RBP4 was obtained to form the disease diagnostic model. The area under the receiver operating characteristic curve of the diagnostic model was 0.98 (sensitivity = 93.42%, specificity = 92.86%). In conclusion, APOCII, CD5L, HABP2, and RBP4 may be potential protein biomarkers of pulmonary TB. Our research provides useful data for early diagnosis of TB.

18.
Peptide mass fingerprinting (PMF) is a valuable method for rapid and high-throughput protein identification using the proteomics approach. Automated search engines, such as MS-Fit, Mascot, ProFound, and PeptIdent, have facilitated protein identification through PMF. The potential to obtain a true MS protein identification result depends on the choice of algorithm as well as on experimental factors that influence the information content of MS data. When mass spectral data are incomplete and/or have low mass accuracy, the “number of matches” approach may be inadequate for a useful identification. Several studies have evaluated factors influencing the quality of mass spectrometry (MS) experiments. Missed cleavages, posttranslational modifications of peptides and contaminants (e.g., keratin) are important factors that can affect the results of MS analyses by influencing the identification process as well as the quality of the MS spectra. We compared search engines frequently used to identify proteins from Homo sapiens and Halobacterium salinarum by evaluating factors including database and mass tolerance, in order to develop an improved search engine for PMF. This study may provide information to help develop a more effective algorithm for protein identification in each species through PMF.

19.
Telocytes (TCs) are described as a particular type of cells of the interstitial space ( www.telocytes.com ). Their main characteristics are the very long telopodes with alternating podoms and podomers. Recently, we performed a comparative proteomic analysis of human lung TCs with fibroblasts, demonstrating that TCs are clearly a distinct cell type. The present study therefore aims to reinforce this idea by comparing lung TCs with endothelial cells (ECs), since TCs and ECs share immunopositivity for CD34. We applied isobaric tag for relative and absolute quantification (iTRAQ) combined with automated 2‐D nano‐ESI LC‐MS/MS to analyse proteins extracted from TCs and ECs in primary cell cultures. In total, 1609 proteins were identified in cell cultures; 98 proteins (5th day) and 82 proteins (10th day) were confidently quantified (screened by two‐sample t‐test, P < 0.05) as up‐ or down‐regulated (fold change >2). We found that in TCs there are 38 up‐regulated proteins at the 5th day and 26 up‐regulated proteins at the 10th day. Bioinformatics analysis using Panther revealed that the 38 proteins associated with TCs represented cellular functions such as intercellular communication (via vesicle‐mediated transport) and structure morphogenesis, being mainly cytoskeletal proteins and oxidoreductases. In addition, we found 60 up‐regulated proteins in ECs, e.g., cell surface glycoprotein MUC18 (15.54‐fold) and von Willebrand factor (5.74‐fold). The 26 up‐regulated proteins in TCs at the 10th day were also analysed and confirmed the same major cellular functions, while the 56 down‐regulated proteins again confirmed their specificity for ECs. In conclusion, we report here the first extensive comparison of proteins from TCs and ECs using a quantitative proteomics approach. Our data show that TCs are completely different from ECs. The protein expression profile showed that TCs play specific roles in intercellular communication and intercellular signalling. Moreover, they might inhibit oxidative stress and cellular ageing and may have pro‐proliferative effects through the inhibition of apoptosis. The group of proteins identified in this study needs to be explored further for its role in the pathogenesis of lung disease.

20.
The peptide‐based quantitation accuracy and precision of LC‐ESI (QSTAR Elite) and LC‐MALDI (4800 MALDI TOF/TOF) were compared by analyzing identical Escherichia coli tryptic digests containing iTRAQ‐labeled peptides of defined abundances (1:1, 2.5:1, 5:1, and 10:1). Only 51.4% of QSTAR spectra were used for quantitation by ProteinPilot Software versus 66.7% of LC‐MALDI spectra. The average protein sequence coverages for LC‐ESI and LC‐MALDI were 24.0 and 18.2% (14.9 and 8.4 peptides per protein), respectively. The iTRAQ‐based expression ratios determined by ProteinPilot from the 57 467 ESI‐MS/MS and 26 085 MALDI‐MS/MS spectra were analyzed for measurement accuracy and reproducibility. When the relative abundances of peptides within a sample were increased from 1:1 to 10:1, the mean ratios calculated on both instruments differed by only 0.7–6.7% between platforms. In the 10:1 experiment, up to 64.7% of iTRAQ ratios from LC‐ESI MS/MS spectra failed S/N thresholds and were excluded from quantitation, while only 0.1% of the equivalent LC‐MALDI iTRAQ ratios were rejected. Re‐analysis of an archived LC‐MALDI sample set stored for 5 months generated 3715 MS/MS spectra for quantitation, compared with 3845 acquired originally, and the average ratios differed by only 3.1%. Overall, MS/MS‐based peptide quantitation performance of offline LC‐MALDI was comparable with on‐line LC‐ESI, which required threefold less time. However, offline LC‐MALDI allows the re‐analysis of archived HPLC‐separated samples.
