Similar articles
1.
In its fourth year, the CASMI 2016 contest was organized to evaluate current chemical structure identification strategies for 19 natural products, using high-resolution LC–MS and LC–MS/MS challenge datasets and automated methods with or without the combination of other tools. These natural products originate from plants, fungi, marine sponges, algae, or micro-algae. Every compound annotation workflow must start with determination of elemental compositions. Of these 19 challenges, one was excluded by the organizers after submission. For the remaining 18 challenges, three software programs were used. MS-FINDER version 1.62 correctly identified 89% of the molecular formulas using an internal database that comprised 13 metabolomics repositories with 45,181 formulas. SIRIUS correctly identified 61% of the compositions using PubChem formulas, and Seven Golden Rules correctly identified 83% using the Dictionary of Natural Products as a targeted database. Next, we performed structural dereplication, for which we used the consensus formula from the three software programs. We submitted two solution sets for these challenges. In the first solution set, avaniya001, we used only the internal MS-FINDER functions for predicting and ranking structures, correctly identifying 53% of the structures as the top hit, 72% within the top-3 structures, and 78% within the top-10 hits. For our second set, avaniya002, we used both MS-FINDER predictions and MS/MS queries against the commercial NIST 14 and METLIN libraries and the public MassBank of North America library. Here we correctly identified 78% of the structures as the top hit and 83% within the top-3 hits. Three challenge spectra remained unidentified within the top-10 hits in both of our submissions.
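
The abstract above does not say how the consensus formula was derived from the three programs. As a minimal sketch, assuming a simple majority vote (an assumption, not the authors' stated procedure), the step might look like the following. The formulas are hypothetical placeholders, not CASMI results.

```python
from collections import Counter


def consensus_formula(candidates):
    """Return the formula proposed by the most tools, or None if there is a tie.

    `candidates` is an iterable of formula strings (None for "no answer").
    Majority voting is an assumption made for this sketch.
    """
    counts = Counter(f for f in candidates if f is not None)
    if not counts:
        return None
    ranked = counts.most_common()
    # A tie for first place means there is no consensus.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Hypothetical top-ranked formulas from the three programs for one challenge.
votes = {
    "MS-FINDER": "C15H10O5",
    "SIRIUS": "C15H10O5",
    "Seven Golden Rules": "C16H14O4",
}
print(consensus_formula(votes.values()))
```

Returning None on a tie keeps ambiguous cases visible so they can be reviewed manually rather than resolved arbitrarily.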

2.
Liquid chromatography–mass spectrometry (LC–MS) is a commonly used analytical platform for non-targeted metabolite profiling experiments. Although data acquisition, processing and statistical analyses are almost routine in such experiments, further annotation and subsequent identification of chemical compounds are not. For identification, tandem mass spectra provide valuable information towards the structure of chemical compounds. These are typically acquired online, in data-dependent mode, or offline, using handcrafted acquisition methods and manually extracted from raw data. Here, we present several methods to fast-track and improve both the acquisition and processing of LC–MS/MS data. Our nearly online (nearline) data-dependent tandem MS strategy creates a minimal set of LC–MS/MS acquisition methods for relevant features revealed by a preceding non-targeted profiling experiment. Using different filtering criteria, such as intensity or ion type, the acquisition of irrelevant spectra is minimized. Afterwards, LC–MS/MS raw data are processed with feature detection and grouping algorithms. The extracted tandem mass spectra can be used for both library search and de-novo identification methods. The algorithms are implemented in the R package MetShot and support the export to Bruker, Agilent or Waters QTOF instruments and the vendor-independent TraML standard. We evaluate the performance of our workflow on a Bruker micrOTOF-Q by comparison of automatically acquired and extracted tandem mass spectra obtained from a mixture of natural product standards against manually extracted reference spectra. Using Arabidopsis thaliana wild-type and biosynthetic gene knockout plants, we characterize the metabolic products of a biosynthetic pathway and demonstrate the integration of our approach into a typical non-targeted metabolite profiling workflow.

4.
Time-consuming and experience-dependent manual validation of tandem mass spectra is usually applied to SEQUEST results. This inefficient method has become a significant bottleneck for MS/MS data processing. Here we introduce a program, AMASS (advanced mass spectrum screener), which filters the tandem mass spectra of SEQUEST results by measuring the match percentage of high-abundance ions and the continuity of matched fragment ions in the b and y series. Compared with the Xcorr and DeltaCn filters, AMASS can increase the number of positives and reduce the number of negatives in 22 datasets generated from 18 known protein mixtures. It effectively removed most noisy spectra, false interpretations, and about half of the poor fragmentation spectra, and AMASS can work synergistically with the Rscore filter. We believe the use of AMASS and Rscore can result in more accurate identification of peptide MS/MS spectra and reduce the time and energy spent on manual validation.
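
The two measures named above, the match percentage of high-abundance ions and the continuity of matched b/y fragment ions, can be illustrated with a short sketch. This is not the AMASS implementation: the function names, the top-N cutoff, and the 0.5 m/z tolerance are all assumptions chosen for illustration.

```python
def match_percentage(peaks, theoretical_mzs, top_n=10, tol=0.5):
    """Percentage of the top_n most intense peaks that match a theoretical ion.

    `peaks` is a list of (mz, intensity) pairs; `theoretical_mzs` is a list of
    predicted fragment m/z values. top_n and tol are illustrative defaults.
    """
    top = sorted(peaks, key=lambda p: p[1], reverse=True)[:top_n]
    if not top:
        return 0.0
    matched = sum(
        1 for mz, _ in top if any(abs(mz - t) <= tol for t in theoretical_mzs)
    )
    return 100.0 * matched / len(top)


def longest_matched_run(matched_flags):
    """Length of the longest run of consecutive matched ions in one series.

    `matched_flags[i]` is True when the i-th ion of a b or y series was found.
    """
    best = current = 0
    for flag in matched_flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


# Toy spectrum: the two most intense peaks are at m/z 100 and 200.
spectrum = [(100.0, 10.0), (200.0, 5.0), (300.0, 1.0)]
print(match_percentage(spectrum, [100.1, 300.2], top_n=2))
print(longest_matched_run([True, True, False, True, True, True]))
```

A spectrum with a high match percentage but only short runs of matched ions would be a plausible candidate for closer manual inspection under such a scheme.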

5.
Human urine proteome analysis by three separation approaches   Cited by: 3 (self-citations: 0; citations by others: 3)
Sun W, Li F, Wu S, Wang X, Zheng D, Wang J, Gao Y. Proteomics, 2005, 5(18): 4994-5001
The urinary proteome is known to be a valuable field of study related to organ functions. There have been several extensive urine proteome studies. However, the overlapping rate among different studies is relatively low. Whether the low overlapping rate was caused by different sample sources, preparation, separation and identification methods is unknown. Moreover, low molecular mass (<10 kDa) proteins have not been studied extensively. In this report, male and female pooled urine samples were collected from healthy volunteers. The urinary proteins were acetone precipitated, separated and identified by three approaches: 1-DE plus 1-D LC/MS/MS, direct 1-D LC/MS/MS and 2-D LC/MS/MS. 1-D tricine gels were used to separate low molecular mass proteins. The tandem mass spectra of positive identifications were quality controlled both by manual validation and using advanced mass spectrum scanner software. A total of 226 urinary proteins were identified; 171 proteins were identified by a proteomics approach for the first time, including 4 male-specific proteins. Twelve low molecular mass proteins were identified. Most urinary proteins had a molecular mass between 30 and 60 kDa and a pI between 4 and 10. The apparent molecular masses of many proteins were different from theoretical ones, which indicated their post-translational modification and degradation. The effects of sample preparation, separation and identification methods on the overlapping rate of different experiments are discussed.

6.
A rapid and systematic strategy based on liquid chromatography–mass spectrometry (LC–MS) profiling and liquid chromatography–tandem mass spectrometry (LC–MS–MS) substructural techniques was utilized to elucidate the degradation products of paclitaxel, the active ingredient in Taxol. This strategy integrates, in a single instrumental approach, analytical HPLC, UV detection, full-scan electrospray MS, and MS–MS to rapidly and accurately elucidate structures of impurities and degradants. In these studies, degradants induced by acid, base, peroxide, and light were profiled using LC–MS and LC–MS–MS methodologies resulting in an LC–MS degradant database which includes information on molecular structures, chromatographic behavior, molecular mass, and MS–MS substructural information. The stressing conditions which may cause drug degradation are utilized to validate the analytical monitoring methods and serve as predictive tools for future formulation and packaging studies. Degradation products formed upon exposure to basic conditions included baccatin III, paclitaxel sidechain methyl ester, 10-deacetylpaclitaxel, and 7-epipaclitaxel. Degradation products formed upon exposure to acidic conditions included 10-deacetylpaclitaxel and the oxetane ring opened product. Treatment with hydrogen peroxide produced only 10-deacetylpaclitaxel. Exposure to high intensity light produced a number of degradants. The most abundant photodegradant of paclitaxel corresponded to an isomer which contains a C3–C11 bridge. These methodologies are applicable at any stage of the drug product cycle from discovery through development. This library of paclitaxel degradants provides a foundation for future development work regarding product monitoring, as well as use as a diagnostic tool for new degradation products.

7.
Spectral similarity is used as a proxy for structural similarity in many tandem mass spectrometry (MS/MS) based metabolomics analyses such as library matching and molecular networking. Although weaknesses in the relationship between spectral similarity scores and the true structural similarities have been described, little development of alternative scores has been undertaken. Here, we introduce Spec2Vec, a novel spectral similarity score inspired by a natural language processing algorithm, Word2Vec. Spec2Vec learns fragmental relationships within a large set of spectral data to derive abstract spectral embeddings that can be used to assess spectral similarities. Using data derived from GNPS MS/MS libraries including spectra for nearly 13,000 unique molecules, we show how Spec2Vec scores correlate better with structural similarity than cosine-based scores. We demonstrate the advantages of Spec2Vec in library matching and molecular networking. Spec2Vec is computationally more scalable, allowing structural analogue searches in large databases within seconds.
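
For context, the cosine-based baseline that Spec2Vec is compared against can be sketched in a few lines. This is a simplified binned cosine score, not Spec2Vec itself and not the exact scoring used in the study; the bin width and the toy spectra are assumptions for illustration.

```python
import math


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (bin index -> intensity)."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz // bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_score(peaks_a, peaks_b, bin_width=1.0):
    """Cosine similarity between two spectra after binning; 0.0 if either is empty."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy spectra as (m/z, intensity) pairs; the values are made up.
query = [(85.03, 20.0), (127.04, 100.0), (145.05, 45.0)]
reference = [(85.03, 25.0), (127.04, 90.0), (163.06, 30.0)]
print(round(cosine_score(query, reference), 3))
```

A score of this kind only rewards peaks that fall into the same bin, which is one reason it can miss structurally related molecules whose fragments are shifted in mass.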

8.
Exemestane is an irreversible aromatase inhibitor used for anticancer therapy. Unfortunately, this drug is also misused in sports to avoid some adverse effects caused by steroid administration. For this reason, exemestane has been included in the World Anti-Doping Agency prohibited list. Usually, doping control laboratories monitor prohibited substances through their metabolites, because parent compounds are readily metabolized. Thus, metabolism studies of these substances are very important. Metabolism of exemestane in humans is not clearly reported, and this drug is detected indirectly through analysis of its only known metabolite, 17β-hydroxyexemestane, using liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) and gas chromatography coupled to mass spectrometry (GC-MS). This drug is extensively metabolized to several unknown oxidized metabolites. For this purpose, LC-MS/MS has been used to propose new urinary exemestane metabolites, mainly oxidized at the C6-exomethylene and simultaneously reduced at the 17-keto group. Urine samples from four volunteers obtained after administration of a 25 mg dose of exemestane were analyzed separately by LC-MS/MS. Urine samples of each volunteer were hydrolyzed, followed by liquid-liquid extraction, and injected into an LC-MS/MS system. Three unreported metabolites were detected in all urine samples by LC-MS/MS. The postulated structures of the detected metabolites were based on molecular formula composition obtained through high accuracy mass determination by liquid chromatography coupled to hybrid quadrupole-time of flight mass spectrometry (LC-QTOF MS) (all mass errors below 2 ppm), electrospray (ESI) product ion spectra and chromatographic behavior.
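
The "mass errors below 2 ppm" criterion refers to the relative difference between a measured and a theoretical m/z, expressed in parts per million. A minimal sketch of that arithmetic follows; the m/z values are illustrative and are not taken from the study.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz, theoretical_mz, max_ppm=2.0):
    """True when the absolute mass error does not exceed max_ppm."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= max_ppm


# Illustrative values only: a 0.0005 m/z deviation near m/z 299.
measured, theoretical = 299.2011, 299.2006
print(f"{ppm_error(measured, theoretical):.2f} ppm")
print(within_tolerance(measured, theoretical))
```

Because the error is relative, the same absolute deviation corresponds to a larger ppm value at lower m/z.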

9.
To determine the protein content of formula, gel electrophoresis was performed on the infant formula samples and the entire protein patterns were analyzed by nano-high performance liquid chromatography-electrospray tandem mass spectrometry (nano-HPLC/ESI/MS/MS). From the commercial infant formula profiled in this study, a total of 154 peptides, corresponding to 31 unique proteins were identified by nano-HPLC/ESI/MS/MS. Each of the identified peptides was reconfirmed by a strict integrated approach using tandem mass spectra. This protein profiling method using gel electrophoresis coupled with nano-HPLC/ESI/MS/MS and manual evaluation is a sensitive and accurate method for protein identification as well as a powerful tool for monitoring various types of food products.

10.
The further development of derivatizing reagents for plasma amino acid quantification by tandem mass spectrometry is described. The succinimide ester of 4-methylpiperazineacetic acid (MPAS), the iTRAQ reagent, was systematically modified to improve tandem mass spectrometer (MS/MS) product ion intensity. 4-Methylpiperazinebutyryl succinimide (MPBS) and dimethylaminobutyryl succinimide (DMABS) afforded one to two orders of magnitude greater MS/MS product ion signal intensity than the MPAS derivative for simple amino acids. CD3 analogues of the modified derivatizing reagents were evaluated for preparation of amino acid isotope-labelled quantifying standards. Acceptable accuracy and precision were obtained with d3-DMABS as the derivatizing reagent for the amino acid standards. The product ion spectra of the DMABS amino acid derivatives are diagnostic for structural isomers including valine/norvaline, alanine/sarcosine and leucine/isoleucine. The improved analytical sensitivity and specificity afforded by these derivatives may help to establish liquid chromatography tandem mass spectrometry (LC-MS/MS) with derivatization-generated isotope-labelled standards as a viable alternative to amino acid analysers.

11.
Peptide natural products show broad biological properties and are commonly produced by orthogonal ribosomal and nonribosomal pathways in prokaryotes and eukaryotes. To harvest this large and diverse resource of bioactive molecules, we introduce here natural product peptidogenomics (NPP), a new MS-guided genome-mining method that connects the chemotypes of peptide natural products to their biosynthetic gene clusters by iteratively matching de novo tandem MS (MS^n) structures to genomics-based structures following biosynthetic logic. In this study, we show that NPP enabled the rapid characterization of over ten chemically diverse ribosomal and nonribosomal peptide natural products of previously unidentified composition from Streptomycete bacteria as a proof of concept to begin automating the genome-mining process. We show the identification of lantipeptides, lasso peptides, linardins, formylated peptides and lipopeptides, many of which are from well-characterized model Streptomycetes, highlighting the power of NPP in the discovery of new peptide natural products from even intensely studied organisms.

12.
De novo peptide sequencing via tandem mass spectrometry.   Cited by: 10 (self-citations: 0; citations by others: 10)
Peptide sequencing via tandem mass spectrometry (MS/MS) is one of the most powerful tools in proteomics for identifying proteins. Because complete genome sequences are accumulating rapidly, the recent trend in interpretation of MS/MS spectra has been database search. However, de novo MS/MS spectral interpretation remains an open problem typically involving manual interpretation by expert mass spectrometrists. We have developed a new algorithm, SHERENGA, for de novo interpretation that automatically learns fragment ion types and intensity thresholds from a collection of test spectra generated from any type of mass spectrometer. The test data are used to construct optimal path scoring in the graph representations of MS/MS spectra. A ranked list of high scoring paths corresponds to potential peptide sequences. SHERENGA is most useful for interpreting sequences of peptides resulting from unknown proteins and for validating the results of database search algorithms in fully automated, high-throughput peptide sequencing.
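
The graph representation mentioned above can be illustrated with a toy example: masses in a fragment ladder become nodes, two nodes are connected when their mass difference equals an amino acid residue mass, and every path from the first to the last node spells a candidate sequence. This is a heavily simplified sketch, not SHERENGA: it has no learned ion types or path scoring, it uses only five residues, and the 0.01 Da tolerance is an assumption.

```python
# Monoisotopic residue masses (Da) for a small subset of amino acids.
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "V": 99.06841,
    "L": 113.08406,
}


def sequences_from_ladder(masses, tol=0.01):
    """Enumerate residue sequences that connect the lowest to the highest mass.

    `masses` are prefix masses of a fragment ladder (starting at 0.0 here).
    Edges join two masses whose difference matches a residue within `tol`.
    """
    nodes = sorted(masses)
    results = []

    def extend(index, prefix):
        if index == len(nodes) - 1:
            results.append(prefix)
            return
        for nxt in range(index + 1, len(nodes)):
            gap = nodes[nxt] - nodes[index]
            for residue, mass in RESIDUE_MASS.items():
                if abs(gap - mass) <= tol:
                    extend(nxt, prefix + residue)

    if nodes:
        extend(0, "")
    return results


# Hypothetical ladder built from the residues G, A and V in that order.
ladder = [0.0, 57.02146, 128.05857, 227.12698]
print(sequences_from_ladder(ladder))
```

In a real spectrum, noise peaks and missing fragments create many competing paths, which is why a scoring scheme for ranking paths is needed.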

13.
A novel derivatization method employing 1,2-dimethylimidazole-4-sulfonyl chloride (DMISC) to improve the mass spectrometric response for phenolic compounds in liquid chromatography electrospray ionization mass spectrometry (LC-ESI-MS) and tandem mass spectrometry (LC-ESI-MS/MS) is described. Several environmentally relevant compounds, including chloro-, aryl- and alkylphenols, steroidal estrogens, and hydroxy-polycyclic aromatic hydrocarbons (OHPAHs), were selected to evaluate this technique. A facile derivatization procedure employing DMISC results in dimethylimidazolesulfonyl (DMIS) derivatives that are stable in aqueous solution. These DMIS derivatives produced intense [M+H]+ ions in positive-ion LC-ESI-MS. The product ion spectra of the [M+H]+ ions of simple phenols were dominated by ions representing the DMIS and dimethylimidazole moieties, whereas product ion spectra of the DMIS derivatives of OHPAHs with three or more fused aromatic rings showed prominent ArO+ ions, the relative intensity of which increased with the number of rings. The DMIS derivatives of the selected phenolic compounds showed excellent chromatographic properties. To substantiate the utility of derivatization with DMISC, an analytical method employing enzyme hydrolysis, solid phase extraction, derivatization with DMISC, and analysis by LC-ESI-MS/MS with multiple reaction monitoring for determination in human urine of 1-hydroxypyrene, a widely used biomarker for the assessment of human exposure to PAHs, was developed and validated.

14.
We developed a probability-based machine-learning program, Colander, to identify tandem mass spectra that are highly likely to represent phosphopeptides prior to database search. We identified statistically significant diagnostic features of phosphopeptide tandem mass spectra based on ion trap CID MS/MS experiments. Statistics for the features are calculated from 376 validated phosphopeptide spectra and 376 nonphosphopeptide spectra. A probability-based support vector machine (SVM) program, Colander, was then trained on five selected features. Data sets were assembled from LC/LC-MS/MS analyses of large-scale phosphopeptide enrichments from proteolyzed cells, tissues and synthetic phosphopeptides. These data sets were used to evaluate the capability of Colander to select pS/pT-containing phosphopeptide tandem mass spectra. When applied to unknown tandem mass spectra, Colander can routinely remove 80% of tandem mass spectra while retaining 95% of phosphopeptide tandem mass spectra. The program reduced the computational time spent on database search by 60-90%. Furthermore, prefiltering tandem mass spectra representing phosphopeptides can increase the number of phosphopeptide identifications under a predefined false positive rate.

15.
Human apolipoprotein B100 (apoB100) has 19 potential N-glycosylation sites, and 16 asparagine residues were reported to be occupied by high-mannose type, hybrid type, and monoantennary and biantennary complex type oligosaccharides. In the present study, a site-specific glycosylation analysis of apoB100 was carried out using reversed-phase high-performance liquid chromatography coupled with electrospray ionization tandem mass spectrometry (LC/ESI MS/MS). ApoB100 was reduced, carboxymethylated, and then digested by trypsin or chymotrypsin. The complex mixture of peptides and glycopeptides was subjected to LC/ESI MS/MS, where product ion spectra of the molecular ions were acquired data-dependently. The glycopeptide ions were extracted and confirmed by the presence of carbohydrate-specific fragment ions, such as m/z 204 (HexNAc) and 366 (HexHexNAc), in the product ion spectra. The peptide moiety of each glycopeptide was determined by the presence of the b- and y-series ions derived from its amino acid sequence in the product ion spectrum, and the oligosaccharide moiety was deduced from the calculated molecular mass of the oligosaccharide. The heterogeneity of carbohydrate structures at 17 glycosylation sites was determined using this methodology. Our data showed that Asn2212, not previously identified as a site of glycosylation, could be glycosylated. It was also revealed that Asn158, 1341, 1350, 3309, and 3331 were occupied by high-mannose type oligosaccharides, and Asn956, 1496, 2212, 2752, 2955, 3074, 3197, 3438, 3868, 4210, and 4404 were predominantly occupied by mono- or disialylated oligosaccharides. Asn3384, the nearest N-glycosylation site to the LDL-receptor binding site (amino acids 3359-3369), was occupied by a variety of oligosaccharides, including high-mannose, hybrid, and complex types. These results are useful for understanding the structure of LDL particles and oligosaccharide function in LDL-receptor ligand binding.
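
The screening step described above, flagging product ion spectra that contain carbohydrate-specific fragment ions, can be sketched as a simple peak lookup. The nominal m/z values 204 and 366 come from the abstract; the 0.5 m/z tolerance, the relative intensity threshold, and the toy spectrum are assumptions for illustration, not parameters from the study.

```python
# Nominal m/z of the carbohydrate-specific fragment ions named in the abstract.
DIAGNOSTIC_IONS = {"HexNAc": 204.0, "HexHexNAc": 366.0}


def diagnostic_ions_found(peaks, tol=0.5, min_rel_intensity=0.01):
    """Return the names of diagnostic ions present in a product ion spectrum.

    `peaks` is a list of (mz, intensity) pairs. A peak counts only if its
    intensity is at least min_rel_intensity times the base peak intensity.
    """
    if not peaks:
        return []
    base = max(intensity for _, intensity in peaks)
    found = []
    for name, target in DIAGNOSTIC_IONS.items():
        for mz, intensity in peaks:
            if abs(mz - target) <= tol and intensity >= min_rel_intensity * base:
                found.append(name)
                break
    return found


# Made-up product ion spectrum containing both diagnostic ions.
spectrum = [(204.09, 800.0), (366.14, 450.0), (657.3, 120.0), (1024.5, 60.0)]
print(diagnostic_ions_found(spectrum))
```

Whether one diagnostic ion is enough to flag a spectrum, or several are required, is a design choice this sketch leaves to the caller.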

16.
Oligosaccharides were analyzed by a combination of high-performance liquid chromatography (HPLC) and mass spectrometry (MS). First, oligosaccharides labeled with 2-aminopyridine were studied to see if they could be analyzed by MS under the conditions used for separation by HPLC. Pyridylamino (PA)-oligosaccharides could be analyzed under these conditions, although the mass spectra were affected. Then, liquid chromatography-mass spectrometry was used to analyze a PA-oligosaccharide mixture derived from human immunoglobulin G. The PA-oligosaccharides were separated on a reversed-phase column and mass-analyzed directly. The observed molecular weights were close to or identical to those expected from the structures, which were estimated from the elution position on HPLC. This method is rapid and simple, as the mass spectrometer can give the accurate molecular weight of each PA-oligosaccharide in one chromatography run, even if the HPLC separation is incomplete. This method can be used to extend the so-called two-dimensional mapping of PA-oligosaccharides. The structure can be studied in greater detail by tandem MS.

17.
Microbial natural products constitute a wide variety of chemical compounds, many of which have antibiotic, antiviral, or anticancer properties that make them interesting for clinical purposes. Natural product classes include polyketides (PKs), nonribosomal peptides (NRPs), and ribosomally synthesized and post-translationally modified peptides (RiPPs). While variants of biosynthetic gene clusters (BGCs) for known classes of natural products are easy to identify in genome sequences, BGCs for new compound classes escape attention. In particular, evidence is accumulating that for RiPPs, the subclasses known thus far may represent only the tip of an iceberg. Here, we present decRiPPter (Data-driven Exploratory Class-independent RiPP TrackER), a RiPP genome mining algorithm aimed at the discovery of novel RiPP classes. DecRiPPter combines a Support Vector Machine (SVM) that identifies candidate RiPP precursors with pan-genomic analyses to identify which of these are encoded within operon-like structures that are part of the accessory genome of a genus. Subsequently, it prioritizes such regions based on the presence of new enzymology and on patterns of gene cluster and precursor peptide conservation across species. We then applied decRiPPter to mine 1,295 Streptomyces genomes, which led to the identification of 42 new candidate RiPP families that could not be found by existing programs. One of these was studied further and elucidated as a representative of a novel subfamily of lanthipeptides, which we designate class V. The 2D structure of the new RiPP, which we name pristinin A3 (1), was solved using nuclear magnetic resonance (NMR), tandem mass spectrometry (MS/MS) data, and chemical labeling. Two previously unidentified modifying enzymes are proposed to create the hallmark lanthionine bridges.
Taken together, our work highlights how novel natural product families can be discovered by methods going beyond sequence similarity searches to integrate multiple pathway discovery criteria.

This study shows that decRiPPter, an innovative algorithmic approach using pan-genomics and machine learning, can discover novel types of ribosomally synthesized peptide (RiPP) natural products, including a new class of lanthipeptides.

18.
The high selectivity and throughput of tandem mass spectrometry allow for rapid identification and localization of various posttranslational protein modifications from complex mixtures by shotgun approaches. Although sequence database search algorithms provide necessary support to process the potentially enormous quantity of MS/MS spectra generated from large scale tandem mass spectrometry experiments, false positive identifications of peptide modifications may exist even after implementation of stringent identification criteria. In this report, we describe factors that lead to misinterpretation of MS/MS spectra as well as common chemical and experimental artifacts that generate false positives using the proteomics-based identification of tyrosine nitration as an example. In addition to the proposed manual validation criteria, the importance of peptide synthesis and subsequent MS/MS characterization for validation of peptide nitration demonstrated by several examples from earlier publications is also presented.

19.
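Superposition of electrospray tandem mass spectra and peptide sequence analysis   Cited by: 11 (self-citations: 1; citations by others: 10)
Using an ion trap electrospray tandem mass spectrometer, the model molecule Met-enkephalin, a heptapeptide synthesized in-house by solid-phase chemical synthesis together with its modified products, a decapeptide, and a 20-residue peptide were fragmented while certain experimental parameters were selectively varied, yielding a series of tandem mass spectra that differ from one another to some degree. Spectra with suitable complementarity were selected and superimposed to give a composite tandem mass spectrum with continuous "triplet" and "doublet" fragment ion peaks, from which the amino acid sequence of the peptide can be deduced conveniently and accurately. The experimental results show that this method has practical value for the mass spectrometric determination of peptides.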

20.
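[Background] Marine-derived natural products have in recent years become an important source of small-molecule drugs. Genome analysis of the marine strain Streptomyces sp. B9173 showed that it contains biosynthetic gene clusters for a variety of natural products and has the potential to produce many new compounds. [Objective] To mine the unknown secondary metabolites of strain B9173 in order to discover compounds with novel structures or unique biological activities. [Methods] Using a combination of HPLC and LC-MS, the known compounds produced by the strain were excluded and three unknown compounds were selected as mining targets. The secondary metabolites were then separated and purified by normal- and reversed-phase silica gel column chromatography, dextran gel column chromatography, and high-performance liquid chromatography to obtain the pure compounds. Their structures were elucidated and identified by mass spectrometry and nuclear magnetic resonance spectroscopy. [Results] The three compounds were identified as tryptanthrin, methylisoindigo, and N,N-dimethylisoindigo, all of which are 2-indolinone alkaloids. Tryptanthrin has a very broad range of biological activities, including antibacterial, antitumor, and anti-inflammatory activity, and is a good precursor for drug development; this is the first time it has been isolated from bacteria. Methylisoindigo is a drug used clinically in China to treat chronic myeloid leukemia; this is the first time it has been isolated from a microbial fermentation broth. At present, all three compounds are obtained mainly by chemical synthesis. Based on the metabolic background of strain B9173, biosynthetic pathways for the three compounds are proposed. [Conclusion] Guided by UV absorption spectra and mass spectral features, three 2-indolinone alkaloids were isolated and identified from the fermentation broth of strain B9173, enriching the variety of bioactive microbial natural products. The proposed biosynthetic pathways also lay a foundation for further study of the biosynthetic mechanisms of tryptanthrin and methylisoindigo; synthetic biology techniques could subsequently be used to reconstruct the biosynthetic pathways of this class of compounds and provide more convenient, lower-cost biosynthetic methods.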
