1.
In metabolomics, the rapid identification of quantitative differences between multiple biological samples remains a major
challenge. While capillary electrophoresis–mass spectrometry (CE–MS) is a powerful tool to simultaneously quantify charged
metabolites, reliable and easy-to-use software that is well suited to analyze CE–MS metabolic profiles is still lacking. Optimized
software tools for CE–MS are needed because of the sometimes large variation in migration time between runs and the wider
variety of peak shapes in CE–MS data compared with LC–MS or GC–MS. Therefore, we implemented a stand-alone application named
JDAMP (Java application for Differential Analysis of Metabolite Profiles), which allows users to identify the metabolites that
vary between two groups. The main features include fast calculation modules and a file converter using an original compact
file format, baseline subtraction, dataset normalization and alignment, visualization on 2D plots (m/z and time axes) with matching metabolite standards, and the detection of significant differences between metabolite profiles.
Moreover, it features an easy-to-use graphical user interface that requires only a few mouse-actions to complete the analysis.
The interface also enables the analyst to evaluate the semiautomatic processes and interactively tune options and parameters
depending on the input datasets. The confirmation of findings is available as a list of overlaid electropherograms, which
is ranked using a novel difference-evaluation function that accounts for peak size and distortion as well as statistical criteria
for accurate difference-detection. Overall, the JDAMP software complements other metabolomics data processing tools and permits easy and rapid detection of significant differences
between multiple complex CE–MS profiles.
5.
The emerging field of metabolomics, which aims to characterize the small molecule metabolites present in biological systems, holds immense potential for areas such as medicine, environmental sciences and agronomy. The purpose of this article is to guide the reader through the history of the field, then through the main steps of the metabolomics workflow, from study design to structure elucidation, and to help the reader understand the key phases of a metabolomics investigation and the rationale underlying the protocols and techniques used. This article is not intended to give standard operating procedures, because several papers on this topic have already been published, but is designed as a tutorial to help beginners understand the concepts and challenges of MS‐based metabolomics. A real case example is taken from the literature to illustrate the application of the metabolomics approach in the field of doping analysis. Challenges and limitations of the approach are then discussed, along with future research directions to cope with these limitations. This tutorial is part of the International Proteomics Tutorial Programme (IPTP18).
6.
Background The majority of ovarian cancer biomarker discovery efforts focus on the identification of proteins that can improve the predictive power of presently available diagnostic tests. We here show that metabolomics, the study of metabolic changes in biological systems, can also provide characteristic small molecule fingerprints related to this disease. Results In this work, new approaches to automatic classification of metabolomic data produced from sera of ovarian cancer patients and benign controls are investigated. The performance of support vector machines (SVM) for the classification of liquid chromatography/time-of-flight mass spectrometry (LC/TOF MS) metabolomic data, focusing on recognizing combinations or "panels" of potential metabolic diagnostic biomarkers, was evaluated. Utilizing LC/TOF MS, sera from 37 ovarian cancer patients and 35 benign controls were studied. Optimum panels of spectral features observed in positive and/or negative ion mode electrospray (ESI) MS with the ability to distinguish between control and ovarian cancer samples were selected using state-of-the-art feature selection methods such as recursive feature elimination and L1-norm SVM. Conclusion Three evaluation processes (leave-one-out-cross-validation, 12-fold-cross-validation, 52-20-split-validation) were used to examine the SVM models based on the selected panels in terms of their ability to differentiate control vs. disease serum samples. The statistical significance of these feature selection results was comprehensively investigated. Classification of the serum sample test set was over 90% accurate, indicating promise that the above approach may lead to the development of an accurate and reliable metabolomic-based approach for detecting ovarian cancer.
7.
Background Data generated from liquid chromatography coupled to high-resolution mass spectrometry (LC-MS)-based studies of a biological
sample can contain large amounts of biologically significant information in the form of proteins, peptides, and metabolites.
Interpreting this data involves inferring the masses and abundances of biomolecules injected into the instrument. Because
of the inherent complexity of mass spectral patterns produced by these biomolecules, the analysis is significantly enhanced
by using visualization capabilities to inspect and confirm results. In this paper we describe Decon2LS, an open-source software
package for automated processing and visualization of high-resolution MS data. Drawing extensively on algorithms developed
over the last ten years for ICR2LS, Decon2LS packages the algorithms as a rich set of modular, reusable processing classes
for performing diverse functions such as reading raw data, routine peak finding, theoretical isotope distribution modelling,
and deisotoping. Because the source code is openly available, these functionalities can now be used to build derivative applications
in a relatively fast manner. In addition, Decon2LS provides an extensive set of visualization tools, such as high-performance
chart controls.
8.
Background One of the goals of global metabolomic analysis is to identify metabolic markers that are hidden within a large background
of data originating from high-throughput analytical measurements. Metabolite-based clustering is an unsupervised approach
for marker identification based on grouping similar concentration profiles of putative metabolites. A major problem of this
approach is that in general there is no prior information about an adequate number of clusters.
9.
MOTIVATION: Metabolomics datasets are generally large and complex. Using principal component analysis (PCA), a simplified view of the variation in the data is obtained. The PCA model can be interpreted and the processes underlying the variation in the data can be analysed. In metabolomics, a priori information about the data is often available. Various forms of this information can be used in an unsupervised data analysis with weighted PCA (WPCA). A WPCA model will give a view on the data that is different from the view obtained using PCA, and it will add to the interpretation of the information in a metabolomics dataset. RESULTS: A method is presented to translate spectra of repeated measurements into weights describing the experimental error. These weights are used in the data analysis with WPCA. The WPCA model will give a view on the data where the non-uniform experimental error is accounted for. Therefore, the WPCA model will focus more on the natural variation in the data. AVAILABILITY: M-files for MATLAB for the algorithm used in this research are available at http://www-its.chem.uva.nl/research/pac/Software/pcaw.zip.
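The idea of turning repeated measurements into error weights can be illustrated with a small sketch. This is not the authors' MATLAB algorithm, which the abstract does not describe. It assumes the simplest possible weighting, one weight per variable equal to the inverse standard deviation across replicates, followed by ordinary PCA on the weighted data. The function names `error_weights`, `weighted_center` and `first_component` are hypothetical.

```python
import statistics


def error_weights(replicates):
    """One weight per variable: 1 / standard deviation across repeated measurements.

    Assumption: the experimental error is summarized per variable only.
    """
    n_vars = len(replicates[0])
    weights = []
    for j in range(n_vars):
        sd = statistics.stdev([row[j] for row in replicates])
        weights.append(1.0 / sd if sd > 0 else 0.0)
    return weights


def weighted_center(data, weights):
    """Mean-center each variable, then multiply it by its weight."""
    n_vars = len(data[0])
    means = [statistics.mean([row[j] for row in data]) for j in range(n_vars)]
    return [[(row[j] - means[j]) * weights[j] for j in range(n_vars)] for row in data]


def first_component(matrix, iterations=200):
    """Leading principal axis of an already centered matrix, found by power iteration."""
    n_vars = len(matrix[0])
    v = [1.0] * n_vars
    for _ in range(iterations):
        scores = [sum(row[j] * v[j] for j in range(n_vars)) for row in matrix]
        v = [sum(scores[i] * matrix[i][j] for i in range(len(matrix))) for j in range(n_vars)]
        norm = sum(x * x for x in v) ** 0.5
        if norm == 0:
            return v
        v = [x / norm for x in v]
    return v


# Variable 0 is measured precisely, variable 1 is noisy (hypothetical numbers).
replicates = [[10.0, 100.0], [10.1, 120.0], [9.9, 80.0]]
data = [[9.0, 90.0], [10.0, 130.0], [11.0, 70.0], [12.0, 110.0]]

weights = error_weights(replicates)
plain_axis = first_component(weighted_center(data, [1.0, 1.0]))
weighted_axis = first_component(weighted_center(data, weights))
```

In this toy dataset the unweighted first component follows the noisy variable, while the weighted one follows the precisely measured variable. That matches the abstract's claim that weighting shifts the focus toward natural variation.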
10.
Introduction Multilevel modeling is a quantitative statistical method to investigate variability and relationships between variables of interest, taking into account population structure and dependencies. It can be used for prediction, data reduction and causal inference from experiments and observational studies, allowing for more efficient elucidation of knowledge. Objectives In this study we introduced the concept of multilevel pharmacokinetics (PK)-driven modelling for large-sample, unbalanced and unadjusted metabolomics data comprising nucleoside and creatinine concentration measurements in urine of healthy individuals and cancer patients. Methods A Bayesian multilevel model was proposed to describe the nucleoside and creatinine concentration ratio considering age, sex and health status as covariates. The predictive performance of the proposed model was summarized via area under the ROC curve, sensitivity and specificity using external validation. Results Cancer was associated with an increase in methylthioadenosine/creatinine excretion rate by a factor of 1.42 (1.09–2.03), which constituted the highest increase among all nucleosides. Age influenced nucleosides/creatinine excretion rates for all nucleosides in the same direction, which was likely caused by a decrease in creatinine clearance with age. There was weak evidence of sex-related differences for methylthioadenosine. The individual a posteriori prediction of patient classification, expressed as area under the ROC curve with 5th and 95th percentiles, was 0.57 (0.5–0.67), with sensitivity and specificity of 0.59 (0.42–0.76) and 0.57 (0.45–0.7), respectively, suggesting limited usefulness of 13 nucleosides/creatinine urine concentration measurements in predicting disease in this population. Conclusion Bayesian multilevel pharmacokinetics-driven modeling in metabolomics may be useful in understanding the data and may constitute a new tool for searching for potential candidate disease indicators.
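For readers unfamiliar with the three performance measures quoted above, the following sketch shows how they are commonly computed from predicted probabilities and true labels. It does not reproduce the Bayesian model or the study data. The labels, scores and the 0.5 threshold are made-up illustrations, and the function names are hypothetical.

```python
def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).

    A case is called positive when its score is at or above the threshold.
    """
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels, scores):
    """Area under the ROC curve via its rank interpretation.

    It equals the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical example: 1 = cancer, 0 = healthy.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
sens, spec = sensitivity_specificity(labels, scores)
auc = roc_auc(labels, scores)
```

An AUC of 0.5 corresponds to chance-level ranking, which is why the reported value of 0.57 is read as limited predictive usefulness.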
11.
SUMMARY: New methods are presented for processing and visualizing mass spectrometry based molecular profile data, implemented as part of the recently introduced MZmine software. They include new features and extensions such as support for the mzXML data format, the capability to perform batch processing for a large number of files, support for parallel processing, new methods for calculating peak areas using a post-alignment peak picking algorithm, and an implementation of Sammon's mapping and curvilinear distance analysis for data visualization and exploratory analysis. AVAILABILITY: MZmine is available under GNU Public license from http://mzmine.sourceforge.net/.
12.
Missing values in mass spectrometry metabolomic datasets occur widely and can originate from a number of sources, for both technical and biological reasons. Currently, little is known about these data, i.e. about their distributions across datasets, the need (or not) to consider them in the data processing pipeline, and most importantly, the optimal way of assigning them values prior to univariate or multivariate data analysis. Here, we address all of these issues using direct infusion Fourier transform ion cyclotron resonance mass spectrometry data. We have shown that missing data are widespread, accounting for ca. 20% of data and affecting up to 80% of all variables, and that they do not occur randomly but rather as a function of signal intensity and mass-to-charge ratio. We have demonstrated that missing data estimation algorithms have a major effect on the outcome of data analysis when comparing the differences between biological sample groups, including by t test, ANOVA and principal component analysis. Furthermore, results varied significantly across the eight algorithms that we assessed for their ability to impute known, but labelled as missing, entries. Based on all of our findings we identified the k-nearest neighbour imputation method (KNN) as the optimal missing value estimation approach for our direct infusion mass spectrometry datasets. However, we believe the wider significance of this study is that it highlights the importance of missing metabolite levels in the data processing pipeline and offers an approach to identify optimal ways of treating missing data in metabolomics experiments.
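A minimal sketch of k-nearest neighbour imputation is given here for orientation. The abstract does not state the distance measure, the value of k, or whether neighbours are sought among samples or among variables, so those choices are assumptions: neighbours are rows, distance is the root mean squared difference over entries observed in both rows, and k defaults to 2. The function name `knn_impute` is hypothetical.

```python
import math


def knn_impute(rows, k=2):
    """Replace None entries with the mean of that column over the k nearest rows.

    Only rows that have a value in the missing column can act as neighbours.
    """

    def distance(a, b):
        # Compare only the positions observed in both rows.
        shared = [(x - y) ** 2 for x, y in zip(a, b) if x is not None and y is not None]
        return math.sqrt(sum(shared) / len(shared)) if shared else math.inf

    filled = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            nearest = [v for d, v in candidates[:k] if d < math.inf]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


# Hypothetical intensity table: rows are samples, columns are peaks.
table = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [5.0, 6.0, 7.0],
    [0.9, 1.9, 2.8],
]
completed = knn_impute(table, k=2)
```

Here the missing entry is filled from the first and last rows, which are the two closest to the incomplete row, rather than from the distant third row. Observed values are left unchanged.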
15.
Polycystic ovary syndrome (PCOS) is a set of symptoms caused by elevated androgens (male hormones) in females. PCOS is the most common endocrine disorder among women between 18 and 44 years. Currently, the pathogenesis of PCOS remains unclear. Liquid chromatography–mass spectrometry (LC/MS)‐based metabolomics is becoming increasingly useful for medical research, especially in revealing disease mechanisms. The aim of this study was to investigate the differences in serum metabolic profiles between patients with PCOS and healthy controls to better understand the mechanism of this disease. Ten patients with PCOS and 10 healthy people were recruited for this study. The serum samples were collected for LC/MS analysis. Multivariate statistical analysis was performed to discover and identify the potential biomarkers. Six biomarkers were found and identified. The biomarkers belonged to different metabolic pathways, including lipid metabolism, carnitine metabolism, androgen metabolism, and bile acid metabolism. Those biomarkers also played different roles in disease progression. Metabolomics is a powerful tool for research into the mechanism of this disease and provides useful information for a better understanding of PCOS.
16.
Metabolic serotypes sensitive to caloric intake may enable serum metabolomic profiles to validate epidemiological parameters and predict disease risk in humans. This long-range goal is complicated by the lack of known state markers and the requirement for simultaneous monitoring of multiple small changes. Therefore, analytical precision for appropriate high data density studies using HPLC separations coupled with coulometric array detectors was evaluated over a two month period in pooled rat sera samples (previously collected and stored at –80 °C), and in authentic biochemical standards. In sera, mean coefficients of variation (CV) of retention time and ratio accuracy within the established metabolic serotype varied within ±1% and ±3%, respectively. In sets of purified standards, the same parameters fluctuated, correspondingly, in ranges of ±0.1% and ±1%. Median CV of the metabolite concentrations were ~13% in standards and ~11–19% in sera, and varied non-monotonically with the analytical system status and experimental design. These parameters were shown to be sufficiently controlled so as not to dominate intra-group biological variability in serum metabolomics studies. Continuation of experimental runs across an analytical breakpoint (column replacement) was associated with disproportionate changes in metabolite concentrations, independent of maintained analytical precision. These changes were sufficient to shift overall profile localization in megavariate projection analyses. We developed a mathematical approach to normalize this break and used partial least squares projection to latent structure discriminant analysis to confirm the validity of this normalization approach. This generally applicable mathematical correction helps enable longer term high data density studies by removing a critical source of systemic variation.
17.
Tannin-enriched extracts from raspberry, cloudberry and strawberry were analysed by liquid chromatography-mass spectrometric (LC-MS) techniques. The raspberry and cloudberry extracts contained a similar mixture of identifiable ellagitannin components and ellagic acid. However, the strawberry extract contained a complex mixture of ellagitannin and proanthocyanidin components that could not be adequately resolved to allow identification of individual peaks. Nevertheless, the negative ESI-MS spectra obtained by direct infusion mass spectrometric (DIMS) analysis described the diversity of these samples. For example, the predominance of signals associated with Lambertianin C in cloudberry and Sanguiin H6 in raspberry tannin extracts could be discerned and the diversity of signals from procyanidin and propelargonidin oligomers could be identified in the strawberry extract. The dose response for the main ellagitannin-derived signals in the raspberry tannin sample revealed a saturation effect probably due to ion suppression effects in the ion trap spectrometer. Nevertheless, DIMS spectra of whole berry extracts described qualitative differences in ellagitannin-derived peaks in raspberry, cloudberry and strawberry samples. In addition, positive mode DIMS spectra illustrated qualitative differences in the anthocyanin composition of berries of progeny from a raspberry breeding population that had been previously analysed by LC-MS. This suggests that DIMS could be applied to rapidly assess differences in polyphenol content, especially in large sample sets such as the progeny from breeding programmes.
18.
Metabolic profiling is considered to be a very promising tool for diagnostic purposes and for assessing nutritional status and response to drugs. However, it is also evident that human metabolic profiles have a complex nature, influenced by many external factors. This, together with the recognized difficulty of assigning people to distinct groups and a general move in clinical science towards personalized medicine, raises interest in exploring the variable metabolic features of each individual separately in a longitudinal study design. In the current paper we have analyzed a set of metabolic profiles of a selection of six urine samples per person from a set of healthy individuals by 1H NMR and reversed-phase UPLC-MS. We have demonstrated that the method for recovery of individual metabolic phenotypes can give complementary information to another established method for analysis of longitudinal data, multilevel component analysis. We also show that individual metabolic signatures can be found not only in 1H NMR data, as has been demonstrated before, but also even more strongly in LC-MS data.
19.
A phenomenon observed earlier in the development of metabolomics as a systems biology methodology consists of a small but significant number of metabolites whose levels are highly correlated between biological replicates. Contrary to initial interpretations, these correlations are not necessarily only between neighboring metabolites in the metabolic network. Most metabolites that participate in common reactions are not correlated in this way, while some non-neighboring metabolites are highly correlated. Here we investigate the origin of such correlations using metabolic control analysis and computer simulation of biochemical networks. A series of cases is identified which lead to high correlation between metabolite pairs in replicate measurements. These are (1) chemical equilibrium, (2) mass conservation, (3) asymmetric control distribution, and (4) unusually high variance in the expression of a single gene. The importance of identifying metabolite correlations within a physiological state and changes of correlation between different states is discussed in the context of systems biology.
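Case (2), mass conservation, can be made concrete with a toy simulation. This is only an illustration of the principle and not the authors' network model. It assumes two metabolites A and B that share a conserved moiety with a fixed total, while the split between them varies randomly from replicate to replicate. The total, the range of the split and the function names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def simulate_conserved_pair(n_replicates, total=10.0, seed=0):
    """Replicate levels of A and B under the constraint A + B = total."""
    rng = random.Random(seed)
    a = [total * rng.uniform(0.3, 0.7) for _ in range(n_replicates)]
    b = [total - level for level in a]
    return a, b


a_levels, b_levels = simulate_conserved_pair(50)
r_conserved = pearson(a_levels, b_levels)
```

Because B is fully determined by A in this sketch, the two levels are perfectly anticorrelated across replicates, regardless of how many reaction steps separate them in the network.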
20.
Nuclear magnetic resonance (NMR) and liquid chromatography-mass spectrometry (LCMS) are frequently used as technological platforms
for metabolomics applications. In this study, the metabolic profiles of ripe fruits from 50 different tomato cultivars, including
beef, cherry and round types, were recorded by both 1H NMR and accurate mass LC-quadrupole time-of-flight (QTOF) MS. Different analytical selectivities were found for these two
profiling techniques. In fact, NMR and LCMS provided complementary data, as the metabolites detected belong to essentially
different metabolic pathways. Yet, upon unsupervised multivariate analysis, both NMR and LCMS datasets revealed a clear segregation
of, on the one hand, the cherry tomatoes and, on the other hand, the beef and round tomatoes. Intra-method (NMR–NMR, LCMS–LCMS)
and inter-method (NMR–LCMS) correlation analyses were performed enabling the annotation of metabolites from highly correlating
metabolite signals. Signals belonging to the same metabolite or to chemically related metabolites are among the highest correlations
found. Inter-method correlation analysis produced highly informative and complementary information for the identification
of metabolites, even in the case of low-abundance NMR signals. The applied approach appears to be a promising strategy in extending
the analytical capacities of these metabolomics techniques with regard to the discovery and identification of biomarkers and
yet unknown metabolites.
Electronic supplementary material The online version of this article (doi:) contains supplementary material, which is available to authorized users.