1.
Korey J. Brownstein Mahmoud Gargouri William R. Folk David R. Gang 《Metabolomics : Official journal of the Metabolomic Society》2017,13(11):133
Introduction
Botanicals containing iridoid and phenylethanoid/phenylpropanoid glycosides are used worldwide for the treatment of inflammatory musculoskeletal conditions that are primary causes of human years lived with disability, such as arthritis and lower back pain.
Objectives
We report the analysis of candidate anti-inflammatory metabolites of several endemic Scrophularia species and Verbascum thapsus used medicinally by peoples of North America.
Methods
Leaves, stems, and roots were analyzed by ultra-performance liquid chromatography-mass spectrometry (UPLC-MS) and partial least squares-discriminant analysis (PLS-DA) was performed in MetaboAnalyst 3.0 after processing the datasets in Progenesis QI.
Results
Comparison of the datasets revealed significant and differential accumulation of iridoid and phenylethanoid/phenylpropanoid glycosides in the tissues of the endemic Scrophularia species and Verbascum thapsus.
Conclusions
Our investigation identified several species of pharmacological interest as good sources for harpagoside and other important anti-inflammatory metabolites.
2.
Sonia Liggi Christine Hinz Zoe Hall Maria Laura Santoru Simone Poddighe John Fjeldsted Luigi Atzori Julian L. Griffin 《Metabolomics : Official journal of the Metabolomic Society》2018,14(4):52
Introduction
Data processing is one of the biggest problems in metabolomics, given the high number of samples analyzed and the need for multiple software packages for each step of the processing workflow.
Objectives
To merge the steps required for metabolomics data processing into a single platform.
Methods
KniMet is a workflow for the processing of mass spectrometry-metabolomics data based on the KNIME Analytics platform.
Results
The approach includes key steps to follow in metabolomics data processing: feature filtering, missing value imputation, normalization, batch correction and annotation.
Conclusion
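To make the first three processing steps named in this entry concrete (feature filtering, missing value imputation, normalization), here is a minimal Python sketch. It is not KniMet, which runs as a KNIME workflow; the function names, the 50% missingness cutoff, half-minimum imputation and total-intensity scaling are all illustrative assumptions. Batch correction and annotation are not covered.

```python
def filter_features(table, max_missing=0.5):
    """Drop features (columns) whose fraction of missing values exceeds max_missing."""
    n_samples = len(table)
    keep = [
        j for j in range(len(table[0]))
        if sum(row[j] is None for row in table) / n_samples <= max_missing
    ]
    return [[row[j] for j in keep] for row in table], keep


def impute_half_minimum(table):
    """Replace each missing value with half the smallest observed value of its feature.

    Assumes every remaining feature has at least one observed value.
    """
    fills = [min(v for v in col if v is not None) / 2 for col in zip(*table)]
    return [[fills[j] if v is None else v for j, v in enumerate(row)] for row in table]


def normalize_total_intensity(table):
    """Scale every sample (row) so that its intensities sum to 1."""
    return [[v / sum(row) for v in row] for row in table]


# Synthetic data: rows are samples, columns are features, None marks a missing value.
raw = [
    [10.0, None, 4.0],
    [20.0, None, None],
    [30.0, 6.0, 2.0],
]
filtered, kept = filter_features(raw)   # feature 1 is missing in 2 of 3 samples
imputed = impute_half_minimum(filtered)
normalized = normalize_total_intensity(imputed)
```

Each step takes and returns a plain table, so steps can be reordered or swapped, which is the kind of modularity the entry attributes to the workflow.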
KniMet provides the user with a local, modular and customizable workflow for the processing of both GC–MS and LC–MS open profiling data.
3.
Tie-juan Shao Zhi-xing He Zhi-jun Xie Hai-chang Li Mei-jiao Wang Cheng-ping Wen 《Metabolomics : Official journal of the Metabolomic Society》2016,12(4):70
Introduction
The differences in fecal metabolome between ankylosing spondylitis (AS)/rheumatoid arthritis (RA) patients and healthy individuals could be the reason for an autoimmune disorder.
Objectives
The study explored the fecal metabolome difference between AS/RA patients and healthy controls to clarify human immune disturbance.
Methods
Fecal samples from 109 individuals (healthy controls 34, AS 40, and RA 35) were analyzed by 1H NMR spectroscopy. Data were analyzed with principal component analysis (PCA) and orthogonal projection to latent structures discriminant analysis (OPLS-DA).
Results
Significant differences in the fecal metabolic profiles could distinguish AS/RA patients from healthy controls but could not distinguish between AS and RA patients. The significantly decreased metabolites in AS/RA patients were butyrate, propionate, methionine, and hypoxanthine. Significantly increased metabolites in AS/RA patients were taurine, methanol, fumarate, and tryptophan.
Conclusion
The metabolome variations in feces indicated AS and RA were two homologous diseases that could not be distinguished by 1H NMR metabolomics.
4.
Normalization and integration of large-scale metabolomics data using support vector regression
Xiaotao Shen Xiaoyun Gong Yuping Cai Yuan Guo Jia Tu Hao Li Tao Zhang Jialin Wang Fuzhong Xue Zheng-Jiang Zhu 《Metabolomics : Official journal of the Metabolomic Society》2016,12(5):89
Introduction
Untargeted metabolomics studies for biomarker discovery often have hundreds to thousands of human samples. Data acquisition of large-scale samples has to be divided into several batches and may span from months to as long as several years. The signal drift of metabolites during data acquisition (intra- and inter-batch) is unavoidable and is a major confounding factor for large-scale metabolomics studies.
Objectives
We aim to develop a data normalization method to reduce unwanted variations and integrate multiple batches in large-scale metabolomics studies prior to statistical analyses.
Methods
We developed a machine learning algorithm-based method, support vector regression (SVR), for large-scale metabolomics data normalization and integration. An R package named MetNormalizer was developed and provided for data processing using SVR normalization.
Results
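The general idea of correcting signal drift and then checking relative standard deviations can be sketched as follows. This is an illustration only: it assumes that drift is modeled from repeatedly injected quality-control (QC) samples, and it swaps the paper's support vector regression for an ordinary least-squares line to stay dependency-free. It does not use or reproduce the MetNormalizer API; all names and numbers are hypothetical.

```python
from statistics import mean, stdev


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def correct_drift(order, intensity, is_qc):
    """Divide each intensity by the drift predicted from the QC injections.

    A straight line stands in for the SVR model here. The predicted drift must
    stay positive over the whole run for the division to be meaningful.
    """
    qc_x = [o for o, q in zip(order, is_qc) if q]
    qc_y = [v for v, q in zip(intensity, is_qc) if q]
    slope, intercept = fit_line(qc_x, qc_y)
    target = mean(qc_y)  # rescale so corrected QCs sit at their original mean
    return [v * target / (slope * o + intercept) for o, v in zip(order, intensity)]


def rsd(values):
    """Relative standard deviation in percent."""
    return 100 * stdev(values) / mean(values)


# One synthetic metabolite peak with a downward drift across the run.
order = [1, 2, 3, 4, 5]
intensity = [100.0, 57.0, 90.0, 51.0, 80.0]
is_qc = [True, False, True, False, True]

corrected = correct_drift(order, intensity, is_qc)
qc_before = [v for v, q in zip(intensity, is_qc) if q]
qc_after = [v for v, q in zip(corrected, is_qc) if q]
rsd_before, rsd_after = rsd(qc_before), rsd(qc_after)
```

Because the synthetic drift is exactly linear, the QC RSD drops to essentially zero after correction. Real drift is rarely linear, which is one reason a more flexible regressor may be preferred.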
After SVR normalization, the portion of metabolite ion peaks with relative standard deviations (RSDs) less than 30% increased to more than 90% of the total peaks, which is much better than other common normalization methods. The reduction of unwanted analytical variations helps to improve the performance of multivariate statistical analyses, both unsupervised and supervised, in terms of classification and prediction accuracy so that subtle metabolic changes in epidemiological studies can be detected.
Conclusion
SVR normalization can effectively remove the unwanted intra- and inter-batch variations, and is much better than other common normalization methods.
5.
Zhanglong Ji Xiaoqian Jiang Shuang Wang Li Xiong Lucila Ohno-Machado 《BMC medical genomics》2014,7(Z1):S14
Background
Privacy protection is an important issue in medical informatics and differential privacy is a state-of-the-art framework for data privacy research. Differential privacy offers provable privacy against attackers who have auxiliary information, and can be applied to data mining models (for example, logistic regression). However, differentially private methods sometimes introduce too much noise and make outputs less useful. Given available public data in medical research (e.g. from patients who sign open-consent agreements), we can design algorithms that use both public and private data sets to decrease the amount of noise that is introduced.
Methodology
In this paper, we modify the update step in the Newton-Raphson method to propose a differentially private distributed logistic regression model based on both public and private data.
Experiments and results
We try our algorithm on three different data sets, and show its advantage over: (1) a logistic regression model based solely on public data, and (2) a differentially private distributed logistic regression model based on private data under various scenarios.
Conclusion
Logistic regression models built with our new algorithm based on both private and public datasets demonstrate better utility than models trained on private or public datasets alone, without sacrificing the rigorous privacy guarantee.
6.
Justin Y. Lee Mark P. Styczynski 《Metabolomics : Official journal of the Metabolomic Society》2018,14(12):153
Introduction
A common problem in metabolomics data analysis is the existence of a substantial number of missing values, which can complicate, bias, or even prevent certain downstream analyses. One of the most widely-used solutions to this problem is imputation of missing values using a k-nearest neighbors (kNN) algorithm to estimate missing metabolite abundances. kNN implicitly assumes that missing values are uniformly distributed at random in the dataset, but this is typically not true in metabolomics, where many values are missing because they are below the limit of detection of the analytical instrumentation.
Objectives
Here, we explore the impact of nonuniformly distributed missing values (missing not at random, or MNAR) on imputation performance. We present a new model for generating synthetic missing data and a new algorithm, No-Skip kNN (NS-kNN), that accounts for MNAR values to provide more accurate imputations.
Methods
We compare the imputation errors of the original kNN algorithm using two distance metrics, NS-kNN, and a recently developed algorithm KNN-TN, when applied to multiple experimental datasets with different types and levels of missing data.
Results
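For readers unfamiliar with the baseline, plain kNN imputation can be sketched in a few lines of Python. This is a generic version under stated assumptions: rows are the items being compared, distance is a root-mean-square difference over the features both rows have observed, and neighbors that lack the target value are skipped. It is not the NS-kNN or KNN-TN algorithm, whose details are not given in this abstract; the function name and data are hypothetical.

```python
import math


def knn_impute(data, k=2):
    """Fill None entries with the mean of the k nearest rows that observed that feature."""

    def distance(a, b):
        # Compare only the features observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    result = [row[:] for row in data]  # leave the input untouched
    for i, row in enumerate(data):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Standard kNN skips neighbors that are themselves missing feature j.
            candidates = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(data)
                if m != i and other[j] is not None
            )
            nearest = [v for d, v in candidates[:k] if d < math.inf]
            if nearest:
                result[i][j] = sum(nearest) / len(nearest)
    return result


# Synthetic example: the first row is missing its third value.
data = [
    [1.0, 2.0, None],
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 5.0],
    [9.0, 9.0, 50.0],
]
imputed = knn_impute(data, k=2)
```

The skipping step is where the assumption criticized in the Introduction enters: if values are missing because they fall below the detection limit, the neighbors that still have a value tend to be the higher ones, so the imputed value may be biased upward.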
Our results show that NS-kNN typically outperforms kNN when at least 20–30% of missing values in a dataset are MNAR. NS-kNN also has lower imputation errors than KNN-TN on realistic datasets when at least 50% of missing values are MNAR.
Conclusion
Accounting for the nonuniform distribution of missing values in metabolomics data can significantly improve the results of imputation algorithms. The NS-kNN method imputes missing metabolomics data more accurately than existing kNN-based approaches when used on realistic datasets.
7.
Johannes Hertel Sandra Van der Auwera Nele Friedrich Katharina Wittfeld Maik Pietzner Kathrin Budde Alexander Teumer Thomas Kocher Matthias Nauck Hans Jörgen Grabe 《Metabolomics : Official journal of the Metabolomic Society》2017,13(4):42
Introduction
Different normalization methods are available for urinary data. However, it is unclear which method performs best in minimizing error variance on a certain data-set as no generally applicable empirical criteria have been established so far.
Objectives
The main aim of this study was to develop an applicable and formally correct algorithm to decide on the normalization method without using phenotypic information.
Methods
We proved mathematically for two classical measurement error models that the optimal normalization method generates the highest correlation between the normalized urinary metabolite concentrations and their blood concentrations or, respectively, their raw urinary concentrations. We then applied the two criteria to the urinary 1H-NMR measured metabolomic data from the Study of Health in Pomerania (SHIP-0; n = 4068) under different normalization approaches and compared the results with in silico experiments to explore the effects of inflated error variance in the dilution estimation.
Results
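The first selection criterion stated under Methods (prefer the normalization whose output correlates most strongly with blood concentrations) could be applied along these lines. This is a hedged sketch for a single metabolite with synthetic numbers; the method names `method_a` and `method_b` are placeholders, and the paper's full procedure, including the second criterion based on raw urinary concentrations, is not reproduced.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def best_normalization(candidates, reference):
    """Return the candidate with the highest correlation to the reference, plus all scores.

    candidates: dict mapping a method name to normalized urinary concentrations.
    reference:  blood concentrations of the same metabolite in the same subjects.
    """
    scores = {name: pearson(values, reference) for name, values in candidates.items()}
    return max(scores, key=scores.get), scores


# Synthetic values for five subjects.
blood = [1.0, 2.0, 3.0, 4.0, 5.0]
candidates = {
    "method_a": [1.1, 1.9, 3.2, 3.9, 5.1],
    "method_b": [2.0, 1.0, 4.0, 3.0, 5.0],
}
winner, scores = best_normalization(candidates, blood)
```

Note that the comparison needs no phenotype data, only the paired measurements, which matches the stated objective.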
In SHIP-0, we demonstrated consistently that probabilistic quotient normalization based on aligned spectra outperforms all other tested normalization methods. Creatinine normalization performed worst, while for unaligned data integral normalization seemed to be most reasonable. The simulated and the actual data were in line with the theoretical modeling, underlining the general validity of the proposed criteria.
Conclusions
The problem of choosing the best normalization procedure for a certain data-set can be solved empirically. Thus, we recommend applying different normalization procedures to the data and comparing their performances via the statistical methodology explicated in this work. On the basis of classical measurement error models, the proposed algorithm will find the optimal normalization method.
8.
Nicholas J. Bond Albert Koulman Julian L. Griffin Zoe Hall 《Metabolomics : Official journal of the Metabolomic Society》2017,13(11):128
Introduction
Mass spectrometry imaging (MSI) experiments result in complex multi-dimensional datasets, which require specialist data analysis tools.
Objectives
We have developed massPix, an R package for analysing and interpreting data from MSI of lipids in tissue.
Methods
massPix produces single ion images, performs multivariate statistics and provides putative lipid annotations based on accurate mass matching against generated lipid libraries.
Results
Classification of tissue regions with high spectral similarity can be carried out by principal components analysis (PCA) or k-means clustering.
Conclusion
massPix is an open-source tool for the analysis and statistical interpretation of MSI data, and is particularly useful for lipidomics applications.
9.
Anna Lindahl Rainer Heuchel Jenny Forshed Janne Lehtiö Matthias Löhr Anders Nordström 《Metabolomics : Official journal of the Metabolomic Society》2017,13(5):61
Introduction
Pancreatic ductal adenocarcinoma (PDAC) is the fifth most common cause of cancer-related death in Europe with a 5-year survival rate of <5%. Chronic pancreatitis (CP) is a risk factor for PDAC development, but in the majority of cases malignancy is discovered too late for curative treatment. There is at present no reliable diagnostic marker for PDAC available.
Objectives
The aim of the study was to identify single blood-based metabolites or a panel of metabolites discriminating PDAC and CP using liquid chromatography-mass spectrometry (LC-MS).
Methods
A discovery cohort comprising PDAC (n = 44) and CP (n = 23) samples was analyzed by LC-MS followed by univariate (Student’s t test) and multivariate (orthogonal partial least squares-discriminant analysis (OPLS-DA)) statistics. Discriminative metabolite features were subject to raw data examination and identification to ensure high feature quality. Their discriminatory power was then confirmed in an independent validation cohort including PDAC (n = 20) and CP (n = 31) samples.
Results
Glycocholic acid, N-palmitoyl glutamic acid and hexanoylcarnitine were identified as single markers discriminating PDAC and CP by univariate analysis. OPLS-DA resulted in a panel of five metabolites including the aforementioned three metabolites as well as phenylacetylglutamine (PAGN) and chenodeoxyglycocholate.
Conclusion
Using LC-MS-based metabolomics we identified three single metabolites and a five-metabolite panel discriminating PDAC and CP in two independent cohorts. Although further study is needed in larger cohorts, the metabolites identified are potentially of use in PDAC diagnostics.
10.
Background
Human cancers are complex ecosystems composed of cells with distinct molecular signatures. Such intratumoral heterogeneity poses a major challenge to cancer diagnosis and treatment. Recent advancements of single-cell techniques such as scRNA-seq have brought unprecedented insights into cellular heterogeneity. Subsequently, a challenging computational problem is to cluster high dimensional noisy datasets with substantially fewer cells than the number of genes.
Methods
In this paper, we introduced a consensus clustering framework, conCluster, for cancer subtype identification from single-cell RNA-seq data. Using an ensemble strategy, conCluster fuses multiple basic partitions into consensus clusters.
Results
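One common way to fuse several partitions into a consensus is a co-association matrix, sketched below in Python. This is a generic ensemble technique used purely for illustration; the abstract does not state how conCluster performs its fusion, so the functions, the 0.5 threshold and the connected-components step are all assumptions.

```python
def co_association(partitions):
    """Fraction of partitions in which each pair of cells shares a cluster label."""
    n, m = len(partitions[0]), len(partitions)
    return [
        [sum(p[i] == p[j] for p in partitions) / m for j in range(n)]
        for i in range(n)
    ]


def consensus_clusters(partitions, threshold=0.5):
    """Link cells that co-cluster in more than `threshold` of the partitions,
    then return one label per connected component."""
    co = co_association(partitions)
    n = len(co)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:  # depth-first walk over strongly co-clustered cells
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Three basic partitions of five cells; label values need not agree across partitions.
partitions = [
    [0, 0, 1, 1, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
consensus = consensus_clusters(partitions)
```

Working from pairwise co-membership avoids having to match arbitrary label names across the individual clusterings.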
Applied to real cancer scRNA-seq datasets, conCluster can more accurately detect cancer subtypes than the widely used scRNA-seq clustering methods. Further, we conducted co-expression network analysis for the identified melanoma subtypes.
Conclusions
Our analysis demonstrates that these subtypes exhibit distinct gene co-expression networks and significant gene sets with different functional enrichment.
11.
Caroline Muschet Gabriele Möller Cornelia Prehn Martin Hrabě de Angelis Jerzy Adamski Janina Tokarz 《Metabolomics : Official journal of the Metabolomic Society》2016,12(10):151
Introduction
Although cultured cells are nowadays regularly analyzed by metabolomics technologies, some issues in study setup and data processing are still not resolved to complete satisfaction: a suitable harvesting method for adherent cells, a fast and robust method for data normalization, and the proof that metabolite levels can be normalized to cell number.
Objectives
We intended to develop a fast method for normalization of cell culture metabolomics samples, to analyze how metabolite levels correlate with cell numbers, and to elucidate the impact of the kind of harvesting on measured metabolite profiles.
Methods
We cultured four different human cell lines and used them to develop a fluorescence-based method for DNA quantification. Further, we assessed the correlation between metabolite levels and cell numbers and focused on the impact of the harvesting method (scraping or trypsinization) on the metabolite profile.
Results
We developed a fast, sensitive and robust fluorescence-based method for DNA quantification showing excellent linear correlation between fluorescence intensities and cell numbers for all cell lines. Furthermore, 82–97% of the measured intracellular metabolites displayed linear correlation between metabolite concentrations and cell numbers. We observed differences in amino acids, biogenic amines, and lipid levels between trypsinized and scraped cells.
Conclusion
We offer a fast, robust, and validated normalization method for cell culture metabolomics samples and demonstrate the eligibility of the normalization of metabolomics data to the cell number. We show a cell line and metabolite-specific impact of the harvesting method on metabolite concentrations.
12.
Background
High-throughput technologies, such as DNA microarray, have significantly advanced biological and biomedical research by enabling researchers to carry out genome-wide screens. One critical task in analyzing genome-wide datasets is to control the false discovery rate (FDR) so that the proportion of false positive features among those called significant is restrained. Recently a number of FDR control methods have been proposed and widely practiced, such as the Benjamini-Hochberg approach, the Storey approach and Significance Analysis of Microarrays (SAM).
Methods
This paper presents a straightforward yet powerful FDR control method termed miFDR, which aims to minimize FDR when calling a fixed number of significant features. We theoretically proved that the strategy used by miFDR is able to find the optimal number of significant features when the desired FDR is fixed.
Results
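For context on the baselines this entry names, the Benjamini-Hochberg step-up procedure is short enough to show in full. The Python sketch below implements that standard procedure only; miFDR itself is not described in enough detail here to reproduce, and the function name and p-values are illustrative.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans marking which hypotheses are called significant
    while controlling the false discovery rate at level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k with p_(k) <= alpha * k / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= alpha * rank / m:
            cutoff_rank = rank
    # Everything ranked at or below the cutoff is rejected, even if it
    # individually missed its own threshold.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Synthetic p-values for four features.
pvalues = [0.001, 0.03, 0.035, 0.5]
rejected = benjamini_hochberg(pvalues, alpha=0.05)
```

In this example the second p-value (0.03) exceeds its own threshold of 0.025 but is still called significant, because the third one passes at rank 3. That step-up behavior is what distinguishes the procedure from a simple per-test cutoff.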
We compared miFDR with the BH approach, the Storey approach and SAM on both simulated datasets and public DNA microarray datasets. The results demonstrated that miFDR outperforms the others by identifying more significant features under the same FDR cut-offs. A literature search showed that many genes called only by miFDR are indeed relevant to the underlying biology of interest.
Conclusions
FDR control has been widely applied to the analysis of high-throughput datasets, allowing for rapid discoveries. Under the same FDR threshold, miFDR is capable of identifying more significant features than its competitors at a comparable level of complexity. It can therefore potentially have a great impact on biological and biomedical research.
Availability
miFDR is available from the authors on request.
13.
Daniel Morvan Aicha Demidem 《Metabolomics : Official journal of the Metabolomic Society》2018,14(5):55
Introduction
Elucidating molecular alterations due to mitochondrial Complex I (CI) mutations may help to understand CI deficiency (CID), not only in mitochondriopathies but also when it is caused by drugs or associated with many diseases.
Objectives
CID metabolic expression was investigated in Leber’s hereditary optic neuropathy (LHON) caused by an inherited mutation of CI.
Methods
NMR-based metabolomics analysis was performed in intact skin fibroblasts from LHON patients. It used several datasets: one-dimensional 1H-NMR spectra, two-dimensional 1H-NMR spectra and quantified metabolites. Spectra were analysed using orthogonal partial least squares-discriminant analysis (OPLS-DA), and quantified metabolites using univariate statistics. The response to idebenone (IDE) and resveratrol (RSV), two agents improving CI activity and mitochondrial functions, was evaluated.
Results
LHON fibroblasts had decreased CI activity (−43%, p < 0.01). Metabolomics revealed prominent alterations in LHON including the increase of fatty acids (FA), polyunsaturated FA and phosphatidylcholine with a variable importance in the prediction (VIP) > 1 in OPLS-DA, p < 0.01 in univariate statistics, and the decrease of amino acids (AA), predominantly glycine, glutamate, glutamine (VIP > 1) and alanine (VIP > 1, p < 0.05). In LHON, treatment with IDE and RSV increased CI activity (+40 and +44%, p < 0.05). IDE decreased FA, polyunsaturated FA and phosphatidylcholine (p < 0.05), but did not modify AA levels. RSV decreased polyunsaturated FA, and increased several AA (VIP > 1 and/or p < 0.05).
Conclusion
LHON fibroblasts display lipid and amino acid metabolism alterations that are reversed by mitochondria-targeted treatments, and can be related to adaptive changes. Findings bring insights into molecular changes induced by CI mutation and, beyond, CID of other origins.
14.
Daniel Cañueto Josep Gómez Reza M. Salek Xavier Correig Nicolau Cañellas 《Metabolomics : Official journal of the Metabolomic Society》2018,14(3):24
Introduction
Adoption of automatic profiling tools for 1H-NMR-based metabolomic studies still lags behind other approaches in the absence of the flexibility and interactivity necessary to adapt to the properties of study data sets of complex matrices.
Objectives
To provide an open source tool that fully integrates these needs and enables the reproducibility of the profiling process.
Methods
rDolphin incorporates novel techniques to optimize exploratory analysis, metabolite identification, and validation of profiling output quality.
Results
The information and quality achieved in two public datasets of complex matrices are maximized.
Conclusion
rDolphin is an open-source R package (http://github.com/danielcanueto/rDolphin) able to provide the best balance between accuracy, reproducibility and ease of use.
15.
Wesley W. Ingwersen Ezra Kahn Joyce Cooper 《The International Journal of Life Cycle Assessment》2018,23(11):2266-2270
Introduction
New platforms are emerging that enable more data providers to publish life cycle inventory data.
Background
Providing datasets that are not complete LCA models results in fragments that are difficult for practitioners to integrate and use for LCA modeling. Additionally, when proxies are used to provide a technosphere input to a process that was not originally intended by the process authors, in most LCA software, this requires modifying the original process.
Results
The use of a bridge process, which is a process created to link two existing processes, is proposed as a solution.
Discussion
Benefits of bridge processes include increasing model transparency, facilitating dataset sharing and integration without compromising original dataset integrity and independence, providing a structure with which to make the data quality associated with process linkages explicit, and increasing model flexibility in the case that multiple bridges are provided. A drawback is that they add processes to existing LCA models, which increases their size.
Conclusions
Bridge processes can be an enabler in allowing users to integrate new datasets without modifying them to link to background databases or other processes they have available. They may not be the ideal long-term solution but provide a solution that works within the existing LCA data model.
16.
Background
DNase I hypersensitive sites (DHSs) are associated with cis-regulatory DNA elements. An efficient method of identifying DHSs can enhance the understanding of chromatin accessibility. Despite a multitude of resources available online, including experimental datasets and computational tools, the complex language of DHSs remains incompletely understood.
Methods
Here, we address this challenge using an approach based on a state-of-the-art machine learning method. We present a novel convolutional neural network (CNN) that combines Inception-like networks with a gating mechanism to respond to multiple patterns and long-term associations in DNA sequences, in order to predict multi-scale DHSs in Arabidopsis, rice and Homo sapiens.
Results
Our method achieves an area under the curve (AUC) of 0.961 on Arabidopsis, 0.969 on rice and 0.918 on Homo sapiens.
Conclusions
Our method provides an efficient and accurate way to identify multi-scale DHS sequences by deep learning.
17.
N. Cesbron A.-L. Royer Y. Guitton A. Sydor B. Le Bizec G. Dervilly-Pinel 《Metabolomics : Official journal of the Metabolomic Society》2017,13(8):99
Introduction
Collecting feces is easy. It offers direct access to endogenous and microbial metabolites.
Objectives
Given the lack of consensus on fecal sample preparation, especially in animal species, we developed a robust protocol allowing untargeted LC-HRMS fingerprinting.
Methods
The conditions of extraction (quantity, preparation, solvents, dilutions) were investigated in bovine feces.
Results
A rapid and simple protocol involving feces extraction with methanol (1/3, M/V) followed by centrifugation and a filtration step (10 kDa) was developed.
Conclusion
The workflow generated repeatable and informative fingerprints for robust metabolome characterization.
18.
Da-Yong Hu Ying Luo Chang-Bin Li Chun-Yu Zhou Xin-Hua Li Ai Peng Jun-Yan Liu 《Metabolomics : Official journal of the Metabolomic Society》2018,14(8):104
Introduction
Nearly all the enzymes that mediate the metabolism of polyunsaturated fatty acids (PUFAs) are present in the kidney. However, the correlation of renal dysfunction with PUFA metabolism in uremic patients remains unknown.
Objectives
To test whether the alterations in the metabolism of PUFAs reflect the renal dysfunction in uremic patients.
Methods
LC–MS/MS-based oxylipin profiling was conducted for the plasma samples from the uremic patients and controls. The data were analyzed by principal component analysis (PCA) and orthogonal partial least squares-discriminant analysis (OPLS-DA). The receiver operating characteristic (ROC) curves and the correlation of the estimated glomerular filtration rate (eGFR) with the key markers were evaluated. Furthermore, qPCR analysis of the whole blood cells was conducted to investigate the possible mechanisms. In addition, a second cohort was used to validate the findings from the first cohort.
Results
The plasma oxylipin profile distinguished the uremic patients from the controls successfully by using both PCA and OPLS-DA models. 5,6-Dihydroxyeicosatrienoic acid (5,6-DHET), 5-hydroxyeicosatetraenoic acid (5-HETE), 9(10)-epoxyoctadecamonoenoic acid [9(10)-EpOME] and 12(13)-EpOME were identified as the key markers to discriminate the patients from controls. The excellent predictive performance of these four markers was validated by ROC analysis. The eGFR correlated significantly and positively with plasma levels of 5,6-DHET and 5-HETE, but negatively with plasma 9(10)-EpOME and 12(13)-EpOME. The changes in these markers may account for the inactivation of cytochrome P450 2C18, 2C19, microsomal epoxide hydrolase (EPHX1), and 5-lipoxygenase in the patients.
Conclusion
The alterations in plasma metabolic profile reflect the renal dysfunction in the uremic patients.
19.
Rachel A. Spicer Christoph Steinbeck 《Metabolomics : Official journal of the Metabolomic Society》2018,14(1):16