期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Non-targeted UHPLC-MS metabolomic data processing methods: a comparative investigation of normalisation,missing value imputation,transformation and scaling

Riccardo Di Guida Jasper Engel J. William Allwood Ralf J. M. Weber Martin R. Jones Ulf Sommer Mark R. Viant

Warwick B. Dunn

《Metabolomics : Official journal of the Metabolomic Society》

Introduction

The generic metabolomics data processing workflow is constructed with a serial set of processes including peak picking, quality assurance, normalisation, missing value imputation, transformation and scaling. The combination of these processes should present the experimental data in an appropriate structure so to identify the biological changes in a valid and robust manner.

Objectives

Currently, different researchers apply different data processing methods and no assessment of the permutations applied to UHPLC-MS datasets has been published. Here we wish to define the most appropriate data processing workflow.

Methods

We assess the influence of normalisation, missing value imputation, transformation and scaling methods on univariate and multivariate analysis of UHPLC-MS datasets acquired for different mammalian samples.

Results

Our studies have shown that once data are filtered, missing values are not correlated with m/z, retention time or response. Following an exhaustive evaluation, we recommend PQN normalisation with no missing value imputation and no transformation or scaling for univariate analysis. For PCA we recommend applying PQN normalisation with Random Forest missing value imputation, glog transformation and no scaling method. For PLS-DA we recommend PQN normalisation, KNN as the missing value imputation method, generalised logarithm transformation and no scaling. These recommendations are based on searching for the biologically important metabolite features independent of their measured abundance.

Conclusion

The appropriate choice of normalisation, missing value imputation, transformation and scaling methods differs depending on the data analysis method and the choice of method is essential to maximise the biological derivations from UHPLC-MS datasets.

相似文献

2.

KniMet: a pipeline for the processing of chromatography–mass spectrometry metabolomics data

Sonia Liggi Christine Hinz Zoe Hall Maria Laura Santoru Simone Poddighe John Fjeldsted Luigi Atzori Julian L. Griffin 《Metabolomics : Official journal of the Metabolomic Society》2018,14(4):52

Introduction

Data processing is one of the biggest problems in metabolomics, given the high number of samples analyzed and the need of multiple software packages for each step of the processing workflow.

Objectives

Merge in the same platform the steps required for metabolomics data processing.

Methods

KniMet is a workflow for the processing of mass spectrometry-metabolomics data based on the KNIME Analytics platform.

Results

The approach includes key steps to follow in metabolomics data processing: feature filtering, missing value imputation, normalization, batch correction and annotation.

Conclusion

KniMet provides the user with a local, modular and customizable workflow for the processing of both GC–MS and LC–MS open profiling data.

相似文献

3.

NS-kNN: a modified <Emphasis Type="Italic">k</Emphasis>-nearest neighbors approach for imputing metabolomics data

Justin Y. Lee Mark P. Styczynski 《Metabolomics : Official journal of the Metabolomic Society》2018,14(12):153

Introduction

A common problem in metabolomics data analysis is the existence of a substantial number of missing values, which can complicate, bias, or even prevent certain downstream analyses. One of the most widely-used solutions to this problem is imputation of missing values using a k-nearest neighbors (kNN) algorithm to estimate missing metabolite abundances. kNN implicitly assumes that missing values are uniformly distributed at random in the dataset, but this is typically not true in metabolomics, where many values are missing because they are below the limit of detection of the analytical instrumentation.

Objectives

Here, we explore the impact of nonuniformly distributed missing values (missing not at random, or MNAR) on imputation performance. We present a new model for generating synthetic missing data and a new algorithm, No-Skip kNN (NS-kNN), that accounts for MNAR values to provide more accurate imputations.

Methods

We compare the imputation errors of the original kNN algorithm using two distance metrics, NS-kNN, and a recently developed algorithm KNN-TN, when applied to multiple experimental datasets with different types and levels of missing data.

Results

Our results show that NS-kNN typically outperforms kNN when at least 20–30% of missing values in a dataset are MNAR. NS-kNN also has lower imputation errors than KNN-TN on realistic datasets when at least 50% of missing values are MNAR.

Conclusion

Accounting for the nonuniform distribution of missing values in metabolomics data can significantly improve the results of imputation algorithms. The NS-kNN method imputes missing metabolomics data more accurately than existing kNN-based approaches when used on realistic datasets.

相似文献

4.

Characterization of missing values in untargeted MS-based metabolomics data and evaluation of missing data handling strategies

Kieu Trinh Do Simone Wahl Johannes Raffler Sophie Molnos Michael Laimighofer Jerzy Adamski Karsten Suhre Konstantin Strauch Annette Peters Christian Gieger Claudia Langenberg Isobel D. Stewart Fabian J. Theis Harald Grallert Gabi Kastenmüller Jan Krumsiek 《Metabolomics : Official journal of the Metabolomic Society》2018,14(10):128

Background

Untargeted mass spectrometry (MS)-based metabolomics data often contain missing values that reduce statistical power and can introduce bias in biomedical studies. However, a systematic assessment of the various sources of missing values and strategies to handle these data has received little attention. Missing data can occur systematically, e.g. from run day-dependent effects due to limits of detection (LOD); or it can be random as, for instance, a consequence of sample preparation.

Methods

We investigated patterns of missing data in an MS-based metabolomics experiment of serum samples from the German KORA F4 cohort (n?=?1750). We then evaluated 31 imputation methods in a simulation framework and biologically validated the results by applying all imputation approaches to real metabolomics data. We examined the ability of each method to reconstruct biochemical pathways from data-driven correlation networks, and the ability of the method to increase statistical power while preserving the strength of established metabolic quantitative trait loci.

Results

Run day-dependent LOD-based missing data accounts for most missing values in the metabolomics dataset. Although multiple imputation by chained equations performed well in many scenarios, it is computationally and statistically challenging. K-nearest neighbors (KNN) imputation on observations with variable pre-selection showed robust performance across all evaluation schemes and is computationally more tractable.

Conclusion

Missing data in untargeted MS-based metabolomics data occur for various reasons. Based on our results, we recommend that KNN-based imputation is performed on observations with variable pre-selection since it showed robust results in all evaluation schemes.

相似文献

5.

A review of RCTs in four medical journals to assess the use of imputation to overcome missing data in quality of life outcomes

Shona Fielding Graeme Maclennan Jonathan A Cook Craig R Ramsay 《Trials》2008,9(1):51

Background

Randomised controlled trials (RCTs) are perceived as the gold-standard method for evaluating healthcare interventions, and increasingly include quality of life (QoL) measures. The observed results are susceptible to bias if a substantial proportion of outcome data are missing. The review aimed to determine whether imputation was used to deal with missing QoL outcomes.

Methods

A random selection of 285 RCTs published during 2005/6 in the British Medical Journal, Lancet, New England Journal of Medicine and Journal of American Medical Association were identified.

Results

QoL outcomes were reported in 61 (21%) trials. Six (10%) reported having no missing data, 20 (33%) reported ≤ 10% missing, eleven (18%) 11%–20% missing, and eleven (18%) reported >20% missing. Missingness was unclear in 13 (21%). Missing data were imputed in 19 (31%) of the 61 trials. Imputation was part of the primary analysis in 13 trials, but a sensitivity analysis in six. Last value carried forward was used in 12 trials and multiple imputation in two. Following imputation, the most common analysis method was analysis of covariance (10 trials).

Conclusion

The majority of studies did not impute missing data and carried out a complete-case analysis. For those studies that did impute missing data, researchers tended to prefer simpler methods of imputation, despite more sophisticated methods being available.

相似文献

6.

A Bregman-proximal point algorithm for robust non-negative matrix factorization with possible missing values and outliers - application to gene expression analysis

Stéphane Chrétien Christophe Guyeux Bastien Conesa Régis Delage-Mouroux Michèle Jouvenot Philippe Huetz Françoise Descôtes 《BMC bioinformatics》2016,17(8):284

Background

Non-Negative Matrix factorization has become an essential tool for feature extraction in a wide spectrum of applications. In the present work, our objective is to extend the applicability of the method to the case of missing and/or corrupted data due to outliers.

Results

An essential property for missing data imputation and detection of outliers is that the uncorrupted data matrix is low rank, i.e. has only a small number of degrees of freedom. We devise a new version of the Bregman proximal idea which preserves nonnegativity and mix it with the Augmented Lagrangian approach for simultaneous reconstruction of the features of interest and detection of the outliers using a sparsity promoting ? ₁ penality.

Conclusions

An application to the analysis of gene expression data of patients with bladder cancer is finally proposed.

相似文献

7.

A lost opportunity for science: journals promote data sharing in metabolomics but do not enforce it

Rachel A. Spicer Christoph Steinbeck 《Metabolomics : Official journal of the Metabolomic Society》2018,14(1):16

Introduction

Data sharing is being increasingly required by journals and has been heralded as a solution to the ‘replication crisis’.

Objectives

(i) Review data sharing policies of journals publishing the most metabolomics papers associated with open data and (ii) compare these journals’ policies to those that publish the most metabolomics papers.

Methods

A PubMed search was used to identify metabolomics papers. Metabolomics data repositories were manually searched for linked publications.

Results

Journals that support data sharing are not necessarily those with the most papers associated to open metabolomics data.

Conclusion

Further efforts are required to improve data sharing in metabolomics.

相似文献

8.

Our local experience with the surgical treatment of ampullary cancer

Dimitrios Botsios Emmanouil Zacharakis Ioannis Lambrou Kostas Tsalis Emmanouil Christoforidis Stavros Kalfadis Evangelos Zacharakis Dimitrios Betsis Ioannis Dadoukis 《International Seminars in Surgical Oncology : ISSO》2005,2(1):16

Background

The aim of this study is to report the outcome after surgical treatment of 32 patients with ampullary cancers from 1990 to 1999.

Methods

Twenty-one of them underwent pancreaticoduodenectomy and 9 local excision of the ampullary lesion. The remaining 2 patients underwent palliative surgery.

Results

When the final histological diagnosis was compared with the preoperative histological finding on biopsy, accurate diagnosis was preoperatively established in 24 patients. The hospital morbidity was 18.8% as 9 complications occurred in 6 patients. Following local excision of the ampullary cancer, the survival rate at 3 and 5 years was 77.7% and 33.3% respectively. Among the patients that underwent Whipple's procedure, the 3-year survival rate was 76.2% and the 5-year survival rate 62%.

Conclusion

In this series, local resection was a safe option in patients with significant co-morbidity or small ampullary tumors less than 2 cm in size, and was associated with satisfactory long-term survival rates.

相似文献

9.

Untargeted metabolomics suffers from incomplete raw data processing

Richard Baran 《Metabolomics : Official journal of the Metabolomic Society》2017,13(9):107

Introduction

Untargeted metabolomics is a powerful tool for biological discoveries. To analyze the complex raw data, significant advances in computational approaches have been made, yet it is not clear how exhaustive and reliable the data analysis results are.

Objectives

Assessment of the quality of raw data processing in untargeted metabolomics.

Methods

Five published untargeted metabolomics studies, were reanalyzed.

Results

Omissions of at least 50 relevant compounds from the original results as well as examples of representative mistakes were reported for each study.

Conclusion

Incomplete raw data processing shows unexplored potential of current and legacy data.

相似文献

10.

Pseudo current density maps of electrophysiological heart,nerve or brain function and their physical basis

Wolfgang?Haberkorn Uwe?Steinhoff Martin?Burghoff Olaf?Kosch Andreas?Morguet Hans?Koch Email author 《Biomagnetic research and technology》2006,4(1):5

Background

In recent years the visualization of biomagnetic measurement data by so-called pseudo current density maps or Hosaka-Cohen (HC) transformations became popular.

Methods

The physical basis of these intuitive maps is clarified by means of analytically solvable problems.

Results

Examples in magnetocardiography, magnetoencephalography and magnetoneurography demonstrate the usefulness of this method.

Conclusion

Hardware realizations of the HC-transformation and some similar transformations are discussed which could advantageously support cross-platform comparability of biomagnetic measurements.

相似文献

11.

Integrated analysis of microRNA-target interactions with clinical outcomes for cancers

Je-Gun Joung Su Yeon Lee Hwa Jung Kang Ju Han Kim 《BMC medical genomics》2014,7(Z1):S10

Background

Clinical statement alone is not enough to predict the progression of disease. Instead, the gene expression profiles have been widely used to forecast clinical outcomes. Many genes related to survival have been identified, and recently miRNA expression signatures predicting patient survival have been also investigated for several cancers. However, miRNAs and their target genes associated with clinical outcomes have remained largely unexplored.

Methods

Here, we demonstrate a survival analysis based on the regulatory relationships of miRNAs and their target genes. The patient survivals for the two major cancers, ovarian cancer and glioblastoma multiforme (GBM), are investigated through the integrated analysis of miRNA-mRNA interaction pairs.

Results

We found that there is a larger survival difference between two patient groups with an inversely correlated expression profile of miRNA and mRNA. It supports the idea that signatures of miRNAs and their targets related to cancer progression can be detected via this approach.

Conclusions

This integrated analysis can help to discover coordinated expression signatures of miRNAs and their target mRNAs that can be employed for therapeutics in human cancers.

相似文献

12.

The non-negative matrix factorization toolbox for biological data mining

Yifeng?Li Email author Alioune?Ngom 《Source code for biology and medicine》2013,8(1):10

Background

Non-negative matrix factorization (NMF) has been introduced as an important method for mining biological data. Though there currently exists packages implemented in R and other programming languages, they either provide only a few optimization algorithms or focus on a specific application field. There does not exist a complete NMF package for the bioinformatics community, and in order to perform various data mining tasks on biological data.

Results

We provide a convenient MATLAB toolbox containing both the implementations of various NMF techniques and a variety of NMF-based data mining approaches for analyzing biological data. Data mining approaches implemented within the toolbox include data clustering and bi-clustering, feature extraction and selection, sample classification, missing values imputation, data visualization, and statistical comparison.

Conclusions

A series of analysis such as molecular pattern discovery, biological process identification, dimension reduction, disease prediction, visualization, and statistical comparison can be performed using this toolbox.

相似文献

13.

R-wave synchronised atrial pacing in pediatric patients with postoperative junctional ectopic tachycardia: the atrioventricular interval investigated by computational analysis and clinical evaluation

Andreas Entenmann Miriam Michel Bruno Ismer Roman Gebauer 《Biomedical engineering online》2017,16(1):139

Background

R-wave synchronised atrial pacing is an effective temporary pacing therapy in infants with postoperative junctional ectopic tachycardia. In the technique currently used, adverse short or long intervals between atrial pacing and ventricular sensing (AP–VS) may be observed during routine clinical practice.

Objectives

The aim of the study was to analyse outcomes of R-wave synchronised atrial pacing and the relationship between maximum tracking rates and AP–VS intervals.

Methods

Calculated AP–VS intervals were compared with those predicted by experienced pediatric cardiologist.

Results

A maximum tracking rate (MTR) set 10 bpm higher than the heart rate (HR) may result in undesirable short AP–VS intervals (minimum 83 ms). A MTR set 20 bpm above the HR is the hemodynamically better choice (minimum 96 ms). Effects of either setting on the AP–VS interval could not be predicted by experienced observers. In our newly proposed technique the AP–VS interval approaches 95 ms for HR > 210 bpm and 130 ms for HR < 130 bpm. The progression is linear and decreases strictly (? 0.4 ms/bpm) between the two extreme levels.

Conclusions

Adjusting the AP–VS interval in the currently used technique is complex and may imply unfavorable pacemaker settings. A new pacemaker design is advisable to allow direct control of the AP–VS interval.

相似文献

14.

Chilling slows anaerobic metabolism to improve anoxia tolerance of insects

Leigh Boardman Jesper G. Sørensen Vladimír Koštál Petr Šimek John S. Terblanche 《Metabolomics : Official journal of the Metabolomic Society》2016,12(12):176

Background

Insects are renowned for their ability to survive anoxia. Anoxia tolerance may be enhanced during chilling through metabolic suppression.

Aims

Here, the metabolomic response of insects to anoxia, both with and without chilling, for different durations (12–36 h) was examined to assess the potential cross-tolerance mechanisms.

Results

Chilling during anoxia (cold anoxia) significantly improved survival relative to anoxia at warmer temperatures. Reduced intermediate metabolites and increased lactic acid, indicating a switch to anaerobic metabolism, were characteristic of larvae in anoxia.

Conclusions

Anoxia tolerance was correlated survival improvements after cold anoxia were correlated with a reduction in anaerobic metabolism.

相似文献

15.

<Emphasis Type="Italic">massPix</Emphasis>: an R package for annotation and interpretation of mass spectrometry imaging data for lipidomics

Nicholas J. Bond Albert Koulman Julian L. Griffin Zoe Hall 《Metabolomics : Official journal of the Metabolomic Society》2017,13(11):128

Introduction

Mass spectrometry imaging (MSI) experiments result in complex multi-dimensional datasets, which require specialist data analysis tools.

Objectives

We have developed massPix—an R package for analysing and interpreting data from MSI of lipids in tissue.

Methods

massPix produces single ion images, performs multivariate statistics and provides putative lipid annotations based on accurate mass matching against generated lipid libraries.

Results

Classification of tissue regions with high spectral similarly can be carried out by principal components analysis (PCA) or k-means clustering.

Conclusion

massPix is an open-source tool for the analysis and statistical interpretation of MSI data, and is particularly useful for lipidomics applications.

相似文献

16.

Optimization of fecal sample preparation for untargeted LC-HRMS based metabolomics

N. Cesbron A.-L. Royer Y. Guitton A. Sydor B. Le Bizec G. Dervilly-Pinel 《Metabolomics : Official journal of the Metabolomic Society》2017,13(8):99

Introduction

Collecting feces is easy. It offers direct outcome to endogenous and microbial metabolites.

Objectives

In a context of lack of consensus about fecal sample preparation, especially in animal species, we developed a robust protocol allowing untargeted LC-HRMS fingerprinting.

Methods

The conditions of extraction (quantity, preparation, solvents, dilutions) were investigated in bovine feces.

Results

A rapid and simple protocol involving feces extraction with methanol (1/3, M/V) followed by centrifugation and a step filtration (10 kDa) was developed.

Conclusion

The workflow generated repeatable and informative fingerprints for robust metabolome characterization.

相似文献

17.

Extrahepatic cholangiocarcinoma with prolonged survival: a case report

Mohammed Z. Al-Zahir Turki AlAmeel 《Journal of medical case reports》2017,11(1):357

Background

Cholangiocarcinoma has poor prognosis and short term-survival. Here, we report the case of a patient with unusually prolonged survival.

Case presentation

Our patient was a 56-year-old Arab man with a 6-month history of obstructive jaundice. A computed tomography scan of his abdomen revealed a mass at the confluence of the hepatic ducts with suspected malignant strictures on endoscopy. A positive tissue diagnosis was achieved more than 18 months after commencement of his symptoms. He remained functional throughout this period despite recurrent episodes of cholangitis.

Conclusions

Cholangiocarcinoma is a presumably fatal disease, especially because patients tend to present late with unresectable disease. Many patient-related and disease-related factors may alter survival.

相似文献

18.

TarMet: a reactive GUI tool for efficient and confident quantification of MS based targeted metabolic and stable isotope tracer analysis

Hongchao Ji Zhimin Zhang Hongmei Lu 《Metabolomics : Official journal of the Metabolomic Society》2018,14(5):68

Introduction

Untargeted and targeted analyses are two classes of metabolic study. Both strategies have been advanced by high resolution mass spectrometers coupled with chromatography, which have the advantages of high mass sensitivity and accuracy. State-of-art methods for mass spectrometric data sets do not always quantify metabolites of interest in a targeted assay efficiently and accurately.

Objectives

TarMet can quantify targeted metabolites as well as their isotopologues through a reactive and user-friendly graphical user interface.

Methods

TarMet accepts vendor-neutral data files (NetCDF, mzXML and mzML) as inputs. Then it extracts ion chromatograms, detects peak position and bounds and confirms the metabolites via the isotope patterns. It can integrate peak areas for all isotopologues automatically.

Results

TarMet detects more isotopologues and quantify them better than state-of-art methods, and it can process isotope tracer assay well.

Conclusion

TarMet is a better tool for targeted metabolic and stable isotope tracer analyses.

相似文献

19.

Does centrifugation matter? Centrifugal force and spinning time alter the plasma metabolome

Dorothea Lesche Roland Geyer Daniel Lienhard Christos T. Nakas Gaëlle Diserens Peter Vermathen Alexander B. Leichtle 《Metabolomics : Official journal of the Metabolomic Society》2016,12(10):159

Background

Centrifugation is an indispensable procedure for plasma sample preparation, but applied conditions can vary between labs.

Aim

Determine whether routinely used plasma centrifugation protocols (1500×g 10 min; 3000×g 5 min) influence non-targeted metabolomic analyses.

Methods

Nuclear magnetic resonance spectroscopy (NMR) and High Resolution Mass Spectrometry (HRMS) data were evaluated with sparse partial least squares discriminant analyses and compared with cell count measurements.

Results

Besides significant differences in platelet count, we identified substantial alterations in NMR and HRMS data related to the different centrifugation protocols.

Conclusion

Already minor differences in plasma centrifugation can significantly influence metabolomic patterns and potentially bias metabolomics studies.

相似文献

20.

HLA-matched sibling transplantation with G-CSF mobilized PBSCs and BM decreases GVHD in adult patients with severe aplastic anemia

Sun?Zi-Min Email author Liu?Hui-Lan Geng?Liang-Quan Wang?Xin-Bing Yao?Wen Liu?Xin Ding?Kai-Yang Han?Yong-Sheng Yang?Hui-Zhi Tang?Bo-lin Tong?Juan Zhu?Wei-Bo Wang?Zu-Yi 《Journal of hematology & oncology》2010,3(1):51

Background

Allogeneic hematopoietic stem cell transplantation (allo-HSCT) is an effective treatment for severe aplastic anemia (SAA). However, graft failure and graft-versus-host disease (GVHD) are major causes of the early morbidity in Allo-HSCT.

Methods

To reduce graft failure and GVHD, we treated fifteen patients with SAA using high- dose of HSCT with both G-CSF mobilized PB and BMSCs from HLA-identical siblings to treat patients with SAA.

Results

All patients had successful bone marrow engraftment. Only one patient had late rejection. Median time to ANC greater than 0.5 × 10⁹/L and platelet counts greater than 20 × 10⁹/L was 12 and 16.5 days, respectively. No acute GVHD was observed. The incidence of chronic GVHD was 6.67%. The total three-year probability of disease-free survival was 79.8%.

Conclusion

HSCT with both G-CSF mobilized PB and BMSCs is a promising approach for heavily transfused and/or allo-immunized patients with SAA.

相似文献