20 similar documents found; search took 15 ms
1.
Sonia Liggi Christine Hinz Zoe Hall Maria Laura Santoru Simone Poddighe John Fjeldsted Luigi Atzori Julian L. Griffin 《Metabolomics : Official journal of the Metabolomic Society》2018,14(4):52
Introduction
Data processing is one of the biggest problems in metabolomics, given the high number of samples analyzed and the need for multiple software packages for each step of the processing workflow.
Objectives
To merge the steps required for metabolomics data processing into a single platform.
Methods
KniMet is a workflow for the processing of mass spectrometry metabolomics data based on the KNIME Analytics Platform.
Results
The approach includes key steps to follow in metabolomics data processing: feature filtering, missing value imputation, normalization, batch correction and annotation.
Conclusion
KniMet provides the user with a local, modular and customizable workflow for the processing of both GC–MS and LC–MS open profiling data.
2.
Normalization and integration of large-scale metabolomics data using support vector regression (total citations: 1; self-citations: 0; citations by others: 1)
Xiaotao Shen Xiaoyun Gong Yuping Cai Yuan Guo Jia Tu Hao Li Tao Zhang Jialin Wang Fuzhong Xue Zheng-Jiang Zhu 《Metabolomics : Official journal of the Metabolomic Society》2016,12(5):89
Introduction
Untargeted metabolomics studies for biomarker discovery often involve hundreds to thousands of human samples. Data acquisition of large-scale samples has to be divided into several batches and may span from months to several years. The signal drift of metabolites during data acquisition (intra- and inter-batch) is unavoidable and is a major confounding factor for large-scale metabolomics studies.
Objectives
We aim to develop a data normalization method to reduce unwanted variations and integrate multiple batches in large-scale metabolomics studies prior to statistical analyses.
Methods
We developed a machine learning-based method using support vector regression (SVR) for large-scale metabolomics data normalization and integration. An R package named MetNormalizer was developed and provided for data processing using SVR normalization.
Results
After SVR normalization, the proportion of metabolite ion peaks with relative standard deviations (RSDs) below 30% increased to more than 90% of the total peaks, much better than with other common normalization methods. The reduction of unwanted analytical variations helps to improve the performance of multivariate statistical analyses, both unsupervised and supervised, in terms of classification and prediction accuracy, so that subtle metabolic changes in epidemiological studies can be detected.
Conclusion
SVR normalization can effectively remove unwanted intra- and inter-batch variations, and performs much better than other common normalization methods.
3.
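The QC-based drift correction behind MetNormalizer (entry 2 above) can be sketched in Python. This is a hedged illustration only: the real method fits a support vector regression of QC intensity against injection order per feature, whereas here a simple piecewise-linear fit through the QC points stands in for the SVR curve, and all names and numbers are invented, not from the MetNormalizer API.

```python
def drift_correct(order, intensity, qc_idx):
    """Divide each intensity by a drift curve interpolated through QC runs."""
    # QC points sorted by injection order
    qc = sorted((order[i], intensity[i]) for i in qc_idx)

    def curve(x):
        # piecewise-linear interpolation between QC points, clamped at the ends
        if x <= qc[0][0]:
            return qc[0][1]
        for (x0, y0), (x1, y1) in zip(qc, qc[1:]):
            if x0 <= x <= x1:
                return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
        return qc[-1][1]

    ref = qc[0][1]  # rescale everything back to the first QC level
    return [iv * ref / curve(o) for o, iv in zip(order, intensity)]

# toy example: a linear drift doubling the signal over 9 injections
order = list(range(9))
raw = [100.0 + 12.5 * o for o in order]   # drifting feature intensity
qc_idx = [0, 4, 8]                         # injections that were QC samples
corrected = drift_correct(order, raw, qc_idx)
```

With the linear toy drift, the interpolated curve matches it exactly and the corrected intensities are flat, which is the intended effect of the intra-batch correction.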
Caroline Muschet Gabriele Möller Cornelia Prehn Martin Hrabě de Angelis Jerzy Adamski Janina Tokarz 《Metabolomics : Official journal of the Metabolomic Society》2016,12(10):151
Introduction
Although cultured cells are nowadays regularly analyzed by metabolomics technologies, some issues in study setup and data processing are still not resolved to complete satisfaction: a suitable harvesting method for adherent cells, a fast and robust method for data normalization, and proof that metabolite levels can be normalized to cell number.
Objectives
We intended to develop a fast method for normalization of cell culture metabolomics samples, to analyze how metabolite levels correlate with cell numbers, and to elucidate the impact of the harvesting method on measured metabolite profiles.
Methods
We cultured four different human cell lines and used them to develop a fluorescence-based method for DNA quantification. Further, we assessed the correlation between metabolite levels and cell numbers and focused on the impact of the harvesting method (scraping or trypsinization) on the metabolite profile.
Results
We developed a fast, sensitive and robust fluorescence-based method for DNA quantification showing excellent linear correlation between fluorescence intensities and cell numbers for all cell lines. Furthermore, 82–97% of the measured intracellular metabolites displayed linear correlation between metabolite concentrations and cell numbers. We observed differences in amino acid, biogenic amine, and lipid levels between trypsinized and scraped cells.
Conclusion
We offer a fast, robust, and validated normalization method for cell culture metabolomics samples and demonstrate the eligibility of normalizing metabolomics data to cell number. We show a cell line- and metabolite-specific impact of the harvesting method on metabolite concentrations.
4.
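The normalization in entry 3 above hinges on the fluorescence/cell-number relationship being linear. That check reduces to an ordinary least-squares fit and a Pearson correlation; a minimal Python sketch follows, with made-up fluorescence readings purely for illustration.

```python
from math import sqrt

def linfit(x, y):
    """Ordinary least squares slope/intercept and Pearson r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / sqrt(sxx * syy)

cells = [1e4, 2e4, 4e4, 8e4]        # seeded cell numbers (illustrative)
fluor = [12.0, 23.9, 48.1, 95.8]    # DNA-stain fluorescence, a.u. (illustrative)
slope, intercept, r = linfit(cells, fluor)
```

An r close to 1 over the relevant cell-number range is what justifies dividing metabolite intensities by the fluorescence-derived cell count.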
Rachel A. Spicer Christoph Steinbeck 《Metabolomics : Official journal of the Metabolomic Society》2018,14(1):16
Introduction
Data sharing is increasingly required by journals and has been heralded as a solution to the ‘replication crisis’.
Objectives
(i) Review the data sharing policies of the journals publishing the most metabolomics papers associated with open data, and (ii) compare these policies with those of the journals that publish the most metabolomics papers overall.
Methods
A PubMed search was used to identify metabolomics papers. Metabolomics data repositories were manually searched for linked publications.
Results
The journals that support data sharing are not necessarily those with the most papers associated with open metabolomics data.
Conclusion
Further efforts are required to improve data sharing in metabolomics.
5.
Background
Microarray technology allows the monitoring of expression levels for thousands of genes simultaneously. This technique helps us to understand gene regulation, as well as gene-by-gene interactions, more systematically. In microarray experiments, however, many undesirable systematic variations are observed, even in replicated experiments. Normalization is the process of removing sources of variation that affect the measured gene expression levels. Although a number of normalization methods have been proposed, it has been difficult to decide which perform best. Normalization plays an important role in the early stages of microarray data analysis, and the subsequent analysis results are highly dependent on it.
Results
In this paper, we use the variability among replicated slides to compare the performance of normalization methods. We also compare normalization methods with regard to bias and mean square error using simulated data.
Conclusions
Our results show that intensity-dependent normalization often performs better than global normalization methods, and that linear and nonlinear normalization methods perform similarly. These conclusions are based on analysis of 36 cDNA microarrays of 3,840 genes obtained in an experiment to search for changes in gene expression profiles during neuronal differentiation of cortical stem cells. Simulation studies confirm our findings.
6.
Introduction
Untargeted metabolomics is a powerful tool for biological discoveries. To analyze the complex raw data, significant advances in computational approaches have been made, yet it is not clear how exhaustive and reliable the data analysis results are.
Objectives
To assess the quality of raw data processing in untargeted metabolomics.
Methods
Five published untargeted metabolomics studies were reanalyzed.
Results
Omissions of at least 50 relevant compounds from the original results, as well as examples of representative mistakes, were reported for each study.
Conclusion
Incomplete raw data processing reveals unexplored potential in current and legacy data.
7.
Clara Pérez-Rambla Leonor Puchades-Carrasco María García-Flores José Rubio-Briones José Antonio López-Guerrero Antonio Pineda-Lucena 《Metabolomics : Official journal of the Metabolomic Society》2017,13(5):52
Introduction
Prostate cancer (PCa) is one of the most common malignancies in men worldwide. The serum prostate specific antigen (PSA) level has been extensively used as a biomarker to detect PCa. However, PSA is not cancer-specific, and various non-malignant conditions, including benign prostatic hyperplasia (BPH), can cause a rise in PSA blood levels, leading to many false positive results.
Objectives
In this study, we evaluated the potential of urinary metabolomic profiling for discriminating PCa from BPH.
Methods
Urine samples from 64 PCa patients and 51 individuals diagnosed with BPH were analysed using 1H nuclear magnetic resonance (1H-NMR). Comparative analysis of urinary metabolomic profiles was carried out using multivariate and univariate statistical approaches.
Results
The urine metabolomic profile of PCa patients is characterised by increased concentrations of branched-chain amino acids (BCAA), glutamate and pseudouridine, and decreased concentrations of glycine, dimethylglycine, fumarate and 4-imidazole-acetate compared with individuals diagnosed with BPH.
Conclusion
PCa patients have a specific urinary metabolomic profile. The results of our study underscore the clinical potential of metabolomic profiling to uncover metabolic changes that could be useful for discriminating PCa from BPH in a clinical context.
8.
Background
An important feature of many genomic studies is quality control and normalization. This is particularly important when analyzing epigenetic data, where the process of obtaining measurements can be bias-prone. The GAW20 data came from the Genetics of Lipid Lowering Drugs and Diet Network (GOLDN), a study with multigeneration families in which DNA cytosine-phosphate-guanine (CpG) methylation was measured pre- and post-treatment with fenofibrate. We performed quality control assessment of the GAW20 DNA methylation data, including normalization, assessment of batch effects and detection of sample swaps.
Results
We show that even after normalization, the GOLDN methylation data have systematic differences pre- and post-treatment. Through investigation of (a) CpG sites containing a single nucleotide polymorphism, (b) the stability of breeding values for methylation across time points, and (c) autosomal gender-associated CpGs, 13 sample swaps were detected, 11 of which were post-treatment.
Conclusions
This paper demonstrates several ways to perform quality control of methylation data in the absence of raw data files and highlights the importance of normalization and quality control of the GAW20 methylation data from the GOLDN study.
9.
Alysha M. De Livera Gavriel Olshansky Julie A. Simpson Darren J. Creek 《Metabolomics : Official journal of the Metabolomic Society》2018,14(5):54
Introduction
In metabolomics studies, unwanted variation inevitably arises from various sources. Normalization, that is, the removal of unwanted variation, is an essential step in the statistical analysis of metabolomics data. However, metabolomics normalization is often considered an imprecise science due to the diverse sources of variation and the number of alternative strategies that may be implemented.
Objectives
We highlight the need for comparative evaluation of different normalization methods and present software strategies to ease this task for both data-oriented and biological researchers.
Methods
We present NormalizeMets—a joint graphical user interface within the familiar Microsoft Excel and freely available R software for comparative evaluation of different normalization methods. The NormalizeMets R package, along with a vignette describing the workflow, can be downloaded from https://cran.r-project.org/web/packages/NormalizeMets/. The Excel interface and the Excel user guide are available at https://metabolomicstats.github.io/ExNormalizeMets.
Results
NormalizeMets allows comparative evaluation of normalization methods using criteria that depend on the given dataset and the ultimate research question, and hence guides researchers in assessing, selecting and implementing a suitable normalization method using either the familiar Microsoft Excel and/or the freely available R software. In addition, the package can be used for visualisation of metabolomics data using interactive graphical displays, and to obtain end statistical results for clustering, classification, biomarker identification adjusting for confounding variables, and correlation analysis.
Conclusion
NormalizeMets is designed for comparative evaluation of normalization methods, and can also be used to obtain end statistical results. The freely available R software offers an attractive proposition for programming-oriented researchers, while the Excel interface offers a familiar alternative to most biological researchers. The package handles the data locally on the user’s own computer, allowing reproducible code to be stored locally.
10.
Jack W. Kent Jr. 《BMC genetics》2016,17(Z2):S5
Background
New technologies for the acquisition of genomic data, while offering unprecedented opportunities for genetic discovery, also impose severe burdens of interpretation and penalties for multiple testing.
Methods
The Pathway-based Analyses Group of the Genetic Analysis Workshop 19 (GAW19) sought to reduce the multiple-testing burden through various approaches to aggregation of high-dimensional data in pathways informed by prior biological knowledge.
Results
Experimental methods tested included the use of "synthetic pathways" (random sets of genes) to estimate the power and false-positive error rate of methods applied to simulated data; data reduction via independent components analysis, single-nucleotide polymorphism (SNP)–SNP interaction, and use of gene sets to estimate genetic similarity; and general assessment of the efficacy of prior biological knowledge in reducing the dimensionality of complex genomic data.
Conclusions
The work of this group explored several promising approaches to managing high-dimensional data, with the caveat that these methods are necessarily constrained by the quality of external bioinformatic annotation.
11.
Background
In recent years, the visualization of biomagnetic measurement data by so-called pseudo current density maps, or Hosaka-Cohen (HC) transformations, has become popular.
Methods
The physical basis of these intuitive maps is clarified by means of analytically solvable problems.
Results
Examples in magnetocardiography, magnetoencephalography and magnetoneurography demonstrate the usefulness of this method.
Conclusion
Hardware realizations of the HC transformation and some similar transformations are discussed, which could advantageously support cross-platform comparability of biomagnetic measurements.
12.
R. E. Patterson A. S. Kirpich J. P. Koelmel S. Kalavalapalli A. M. Morse K. Cusi N. E. Sunny L. M. McIntyre T. J. Garrett R. A. Yost 《Metabolomics : Official journal of the Metabolomic Society》2017,13(11):142
Introduction
Untargeted metabolomics workflows include numerous points where variance and systematic errors can be introduced. Owing to the diversity of the lipidome, manual peak picking and quantitation using molecule-specific internal standards are unrealistic; therefore, quality peak-picking algorithms, together with further feature processing and normalization algorithms, are important. Subsequent normalization, data filtering, statistical analysis, and biological interpretation are simplified when quality data acquisition and feature processing are employed.
Objectives
Metrics for QC are important throughout the workflow. The robust workflow presented here provides techniques to ensure that QC checks are implemented throughout sample preparation, data acquisition, pre-processing, and analysis.
Methods
The untargeted lipidomics workflow includes sample standardization prior to acquisition, blocks of QC standards and blanks run at systematic intervals between randomized blocks of experimental data, blank feature filtering (BFF) to remove features not originating from the sample, and QC analysis of data acquisition and processing.
Results
The workflow was successfully applied to mouse liver samples, which were investigated to discern lipidomic changes throughout the development of nonalcoholic fatty liver disease (NAFLD). The workflow, including the novel filtering method BFF, allows improved confidence in results and conclusions for lipidomic applications.
Conclusion
Using a mouse model developed for the study of the transition of NAFLD from an early stage, known as simple steatosis, to the later stage, nonalcoholic steatohepatitis, in combination with our novel workflow, we have identified phosphatidylcholines, phosphatidylethanolamines, and triacylglycerols that may contribute to disease onset and/or progression.
13.
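The blank feature filtering (BFF) idea from entry 12 above, dropping features whose signal is not sufficiently above the extraction blanks, can be sketched as follows. This is an illustration only: the 3x sample-to-blank ratio is a common convention rather than necessarily the paper's threshold, and the feature names and intensities are invented.

```python
def blank_filter(features, samples, blanks, min_ratio=3.0):
    """Keep feature names whose mean sample intensity exceeds
    min_ratio times the mean blank intensity."""
    kept = []
    for name in features:
        s = sum(samples[name]) / len(samples[name])
        b = sum(blanks[name]) / len(blanks[name])
        if b == 0 or s / b >= min_ratio:
            kept.append(name)  # feature plausibly originates from the sample
    return kept

# toy data: one real lipid feature and one solvent contaminant
samples = {"PC(34:1)": [900.0, 1100.0], "contaminant": [300.0, 320.0]}
blanks  = {"PC(34:1)": [50.0, 70.0],    "contaminant": [290.0, 310.0]}
kept = blank_filter(list(samples), samples, blanks)
```

The contaminant, being equally intense in blanks and samples, is filtered out, while the genuine lipid feature survives.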
Thao Vu Eli Riekeberg Yumou Qiu Robert Powers 《Metabolomics : Official journal of the Metabolomic Society》2018,14(8):108
Introduction
Failure to properly account for normal systematic variations in OMICS datasets may result in misleading biological conclusions. Accordingly, normalization is a necessary step in the proper preprocessing of OMICS datasets: an optimal normalization method will effectively reduce unwanted biases and increase the accuracy of downstream quantitative analyses. However, it is currently unclear which normalization method is best, since each algorithm addresses systematic noise in different ways.
Objective
To determine an optimal choice of normalization method for the preprocessing of metabolomics datasets.
Methods
Nine MVAPACK normalization algorithms were compared using simulated and experimental NMR spectra modified with added Gaussian noise and random dilution factors. Methods were evaluated based on their ability to recover the intensities of the true spectral peaks and the reproducibility of true classifying features from an orthogonal projections to latent structures discriminant analysis (OPLS-DA) model.
Results
Most normalization methods (except histogram matching) performed equally well at modest levels of signal variance. Only probabilistic quotient (PQ) and constant sum (CS) normalization maintained the highest level of peak recovery (>67%) and correlation with true loadings (>0.6) at maximal noise.
Conclusion
PQ and CS performed best at recovering peak intensities and reproducing the true classifying features for an OPLS-DA model regardless of spectral noise level. Our findings suggest that performance is largely determined by the level of noise in the dataset, while the effect of dilution factors was negligible. A minimal allowable noise level of 20% was also identified for a valid NMR metabolomics dataset.
14.
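The two methods entry 13 above found most robust, probabilistic quotient (PQ) and constant sum (CS) normalization, are both short enough to sketch. The PQ version below follows the usual Dieterle-style recipe (median reference spectrum, median quotient per sample); the spectra are toy numbers, not data from the paper.

```python
from statistics import median

def constant_sum(spectrum, total=100.0):
    """CS normalization: scale a spectrum to a fixed total intensity."""
    s = sum(spectrum)
    return [v * total / s for v in spectrum]

def pqn(spectra):
    """PQ normalization against a median reference spectrum."""
    # reference = per-feature median intensity across all spectra
    ref = [median(col) for col in zip(*spectra)]
    out = []
    for spec in spectra:
        # most probable dilution factor = median feature-wise quotient
        q = median(v / r for v, r in zip(spec, ref) if r > 0)
        out.append([v / q for v in spec])
    return out

# second sample is the first diluted 1:2; PQN recovers matching intensities
spectra = [[10.0, 20.0, 30.0], [5.0, 10.0, 15.0], [10.0, 22.0, 28.0]]
normed = pqn(spectra)
```

Because PQ estimates a single dilution factor per sample from the bulk of the features, it is less distorted than CS when a few large peaks change for biological reasons.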
N. Cesbron A.-L. Royer Y. Guitton A. Sydor B. Le Bizec G. Dervilly-Pinel 《Metabolomics : Official journal of the Metabolomic Society》2017,13(8):99
Introduction
Collecting feces is easy, and it offers a direct readout of endogenous and microbial metabolites.
Objectives
In the absence of a consensus on fecal sample preparation, especially in animal species, we developed a robust protocol allowing untargeted LC-HRMS fingerprinting.
Methods
The conditions of extraction (quantity, preparation, solvents, dilutions) were investigated in bovine feces.
Results
A rapid and simple protocol involving feces extraction with methanol (1/3, m/v) followed by centrifugation and a filtration step (10 kDa) was developed.
Conclusion
The workflow generated repeatable and informative fingerprints for robust metabolome characterization.
15.
Alžběta Gardlo Age K. Smilde Karel Hron Marcela Hrdá Radana Karlíková David Friedecký Tomáš Adam 《Metabolomics : Official journal of the Metabolomic Society》2016,12(7):117
Introduction
Urine is one of the body fluids often used in metabolomics studies. The concentrations of metabolites in urine are affected by the hydration status of an individual, resulting in dilution differences; the data therefore require normalization to correct for these differences. Two normalization techniques are commonly applied to urine samples prior to further statistical analysis. The first, AUC normalization, standardizes the area under the curve (AUC) within a sample to the median, mean or another suitable representation of the amount of dilution. The second approach uses specific end-product metabolites such as creatinine, with all intensities within a sample expressed relative to the creatinine intensity.
Objectives
Another way of looking at urine metabolomics data is to recognize that the ratios between peak intensities are the information-carrying features. This opens up the possibility of using another class of data analysis techniques designed to deal with such ratios: compositional data analysis. The aim of this paper is to develop PARAFAC modeling of three-way urine metabolomics data in the context of compositional data analysis and to compare this with standard normalization techniques.
Methods
In the compositional data analysis approach, special coordinate systems are defined to deal with the ratio problem. In essence, this comes down to using distance measures other than the Euclidean distance used in conventional analysis of metabolomics data.
Results
We illustrate this approach in combination with three-way methods (i.e. PARAFAC) on a longitudinal urine metabolomics study and two simulations. In both cases, the advantage of the compositional approach is established in terms of improved interpretability of the scores and loadings of the PARAFAC model.
Conclusion
For urine metabolomics studies, we advocate the use of compositional data analysis approaches. They are easy to use, well established and proven to give reliable results.
16.
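The compositional view in entry 15 above (only peak ratios carry information) is often made concrete via the centred log-ratio (clr) transform, one of the special coordinate systems that approach uses. A useful property, sketched below with invented intensities, is that clr coordinates are invariant to sample dilution, which is exactly the nuisance that AUC or creatinine normalization tries to remove.

```python
from math import log

def clr(sample):
    """Centred log-ratio transform: log of each part minus the mean log."""
    logs = [log(v) for v in sample]
    m = sum(logs) / len(logs)
    return [lv - m for lv in logs]

urine = [4.0, 1.0, 25.0]            # peak intensities (illustrative)
diluted = [v / 3 for v in urine]    # same sample, three times more dilute
```

Diluting every peak by the same factor shifts all logs equally, so subtracting the mean log cancels the dilution: clr(urine) and clr(diluted) coincide.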
Background
Cerebral infarction of different etiologies appears to differ in fibrinogen levels, so the current work explores the relationship between fibrinogen level and the subtypes of the TOAST criteria in the acute stage of ischemic stroke.
Methods
A total of 577 patients treated for acute ischemic stroke in our hospital from December 2008 to December 2010 were included, and blood samples taken within 72 hours of onset were processed for fibrinogen (PT-der) measurement. The selected patients were classified according to the TOAST criteria to study the distribution of fibrinogen levels across the stroke subtypes.
Results
The distribution of fibrinogen levels across the subtypes was not statistically significant.
Conclusions
In the acute stage of ischemic stroke, fibrinogen level was not related to the subtypes of the TOAST criteria.
17.
Nicholas J. Bond Albert Koulman Julian L. Griffin Zoe Hall 《Metabolomics : Official journal of the Metabolomic Society》2017,13(11):128
Introduction
Mass spectrometry imaging (MSI) experiments result in complex multi-dimensional datasets, which require specialist data analysis tools.
Objectives
We have developed massPix—an R package for analysing and interpreting data from MSI of lipids in tissue.
Methods
massPix produces single-ion images, performs multivariate statistics and provides putative lipid annotations based on accurate mass matching against generated lipid libraries.
Results
Classification of tissue regions with high spectral similarity can be carried out by principal components analysis (PCA) or k-means clustering.
Conclusion
massPix is an open-source tool for the analysis and statistical interpretation of MSI data, and is particularly useful for lipidomics applications.
18.
Yuji Sawada Hirokazu Tsukaya Yimeng Li Muneo Sato Kensuke Kawade 《Metabolomics : Official journal of the Metabolomic Society》2017,13(6):75
Introduction
In plant metabolomics, metabolite contents are often normalized by sample weight. However, accurate weighing of very small samples, such as individual Arabidopsis thaliana seeds (approximately 20 µg), is difficult, which may lead to irreproducible results.
Objectives
We aimed to establish alternative normalization methods for seed-grain-based comparative metabolomics of A. thaliana.
Methods
Arabidopsis thaliana seeds were assumed to have a prolate spheroid shape. Using a microscope image of each seed, the lengths of the major and minor axes were measured by fitting the projected 2-dimensional shape of each seed as an ellipse. Metabolic profiles of individual diploid or tetraploid A. thaliana seeds were measured by our highly sensitive protocol (“widely targeted metabolomics”), which uses liquid chromatography coupled with tandem quadrupole mass spectrometry. Mass spectrometric analysis of 1 µL of solution extract identified more than 100 metabolites. The data were normalized by various seed-size measures, including seed volume (single-grain-based analysis). For comparison, metabolites were extracted from 4 mg of diploid and tetraploid A. thaliana seeds and their metabolic profiles were analyzed with normalization by weight (weight-based analysis).
Results
A small number of metabolites showed statistically significant differences in the single-grain-based analysis compared to the weight-based analysis. A total of 17 metabolites showed statistically different accumulation between ploidy types, with similar fold changes in both analyses.
Conclusion
Seed-size measures obtained by microscopic imaging were useful for data normalization. Single-grain-based analysis enables evaluation of the metabolism of each seed and elucidates the metabolic profiles of precious bioresources using small amounts of sample.
19.
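The volume normalization in entry 18 above models each seed as a prolate spheroid, whose volume from the fitted ellipse's full major axis a and minor axis b is V = (4/3)·π·(a/2)·(b/2)². A minimal Python sketch; the axis lengths are illustrative values, not measurements from the paper.

```python
from math import pi

def prolate_spheroid_volume(major, minor):
    """Volume of a prolate spheroid from full axis lengths.
    Same length units in -> cubic units out."""
    return (4.0 / 3.0) * pi * (major / 2.0) * (minor / 2.0) ** 2

def per_volume(intensity, major, minor):
    """Normalize a metabolite intensity by the estimated seed volume."""
    return intensity / prolate_spheroid_volume(major, minor)

# e.g. a 0.50 mm x 0.30 mm seed silhouette
v = prolate_spheroid_volume(0.50, 0.30)   # volume in mm^3
```

Dividing each metabolite intensity by v substitutes a microscope measurement for the error-prone weighing of a ~20 µg seed.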
Andrea Padoan Daniela Basso Carlo-Federico Zambon Tommaso Prayer-Galetti Giorgio Arrigoni Dania Bozzato Stefania Moz Filiberto Zattoni Rino Bellocco Mario Plebani 《Clinical proteomics》2018,15(1):23
Background
Lower urinary tract symptoms (LUTS) and prostate specific antigen-based parameters seem to have only limited utility for the differential diagnosis of prostate cancer (PCa). MALDI-TOF/MS peptidomic profiling could be a useful diagnostic tool for biomarker discovery, although reproducibility issues have limited its applicability until now. The current study aimed to evaluate new MALDI-TOF/MS candidate biomarkers.
Methods
Within- and between-subject variability of MALDI-TOF/MS-based peptidomic urine and serum analyses were evaluated in 20 and 15 healthy donors, respectively. Normalizations and approaches for handling below-limit-of-detection (LOD) values were used to enhance reproducibility, while Monte Carlo experiments were performed to verify whether measurement error can be addressed with LOD data. Post-prostatic massage urine and serum samples from 148 LUTS patients were analysed using MALDI-TOF/MS. Regression-calibration and simulation-and-extrapolation methods were used to derive the unbiased association between peptidomic features and PCa.
Results
Although the median normalized peptidomic variability was 24.9%, the within- and between-subject variability showed that median normalization, LOD adjustment, and log2 data transformation were the best combination in terms of reliability; under measurement-error conditions, the intraclass correlation coefficient was a reliable estimate when LOD/2 was substituted for below-LOD values. In the patients studied, 43 peptides were shared by urine and serum, and several features were found to be associated with PCa. Only a few serum features, however, showed statistical significance after the multiple testing procedures were completed. Two serum fragmentation patterns corresponded to complement C4-A.
Conclusions
MALDI-TOF/MS serum peptidome profiling was more efficacious than post-prostatic massage urine analysis in discriminating PCa.
20.
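The preprocessing combination that entry 19 above found most reliable (median normalization, LOD/2 substitution for below-LOD values, log2 transform) can be sketched as a short pipeline. This is a hedged illustration: None marks a below-LOD measurement, and the function name, ordering of steps within a spectrum, and numbers are assumptions for demonstration, not the paper's implementation.

```python
from math import log2
from statistics import median

def preprocess(spectrum, lod):
    """LOD/2 substitution, then within-spectrum median normalization,
    then log2 transform."""
    # 1) substitute LOD/2 for below-LOD (None) values
    filled = [lod / 2 if v is None else v for v in spectrum]
    # 2) median normalization within the spectrum
    m = median(filled)
    scaled = [v / m for v in filled]
    # 3) log2 transform
    return [log2(v) for v in scaled]

feats = preprocess([40.0, None, 160.0], lod=10.0)
```

After this pipeline, spectra from different runs are on a comparable, variance-stabilized scale, which is what made the downstream reliability estimates (such as the intraclass correlation coefficient) trustworthy in the study.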
Dorothea Lesche Roland Geyer Daniel Lienhard Christos T. Nakas Gaëlle Diserens Peter Vermathen Alexander B. Leichtle 《Metabolomics : Official journal of the Metabolomic Society》2016,12(10):159