Similar Articles
Found 20 similar articles.
1.

Introduction

Data processing is one of the biggest problems in metabolomics, given the high number of samples analyzed and the need for multiple software packages for each step of the processing workflow.

Objectives

To merge the steps required for metabolomics data processing into a single platform.

Methods

KniMet is a workflow for the processing of mass spectrometry metabolomics data built on the KNIME Analytics Platform.

Results

The approach includes key steps to follow in metabolomics data processing: feature filtering, missing value imputation, normalization, batch correction and annotation.

Conclusion

KniMet provides the user with a local, modular and customizable workflow for the processing of both GC–MS and LC–MS open profiling data.

2.

Introduction

Untargeted metabolomics studies for biomarker discovery often involve hundreds to thousands of human samples. Data acquisition for large-scale studies has to be divided into several batches and may span from months to several years. The signal drift of metabolites during data acquisition (intra- and inter-batch) is unavoidable and is a major confounding factor for large-scale metabolomics studies.

Objectives

We aim to develop a data normalization method to reduce unwanted variations and integrate multiple batches in large-scale metabolomics studies prior to statistical analyses.

Methods

We developed a machine learning algorithm-based method, support vector regression (SVR), for large-scale metabolomics data normalization and integration. An R package named MetNormalizer was developed and provided for data processing using SVR normalization.
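A minimal sketch of the general idea behind QC-based SVR drift correction, in Python rather than R; this is not the MetNormalizer implementation, and the function name, kernel settings, and simulated data are illustrative assumptions:

```python
import numpy as np
from sklearn.svm import SVR

def svr_drift_correct(intensity, injection_order, is_qc):
    """Correct one feature's signal drift using pooled QC injections.

    intensity       : (n_samples,) raw peak intensities for a single feature
    injection_order : (n_samples,) injection sequence numbers
    is_qc           : (n_samples,) boolean mask marking pooled QC injections
    """
    x_qc = injection_order[is_qc].reshape(-1, 1)
    y_qc = intensity[is_qc]
    # Fit the drift trend on QC samples only (RBF kernel, illustrative settings).
    model = SVR(kernel="rbf", C=1.0, epsilon=0.1).fit(x_qc, y_qc)
    drift = model.predict(injection_order.reshape(-1, 1))
    # Divide out the predicted drift and rescale to the QC median intensity.
    return intensity / drift * np.median(y_qc)

rng = np.random.default_rng(0)
order = np.arange(100)
truth = 1000 + 2.5 * order                 # simulated monotone intra-batch drift
raw = truth + rng.normal(0, 30, 100)
qc = (order % 10 == 0)                     # every 10th injection is a pooled QC
corrected = svr_drift_correct(raw, order, qc)
print(np.std(corrected[qc]) / np.mean(corrected[qc]))  # QC RSD after correction
```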

Results

After SVR normalization, the portion of metabolite ion peaks with relative standard deviations (RSDs) less than 30 % increased to more than 90 % of the total peaks, which is much better than other common normalization methods. The reduction of unwanted analytical variations helps to improve the performance of multivariate statistical analyses, both unsupervised and supervised, in terms of classification and prediction accuracy so that subtle metabolic changes in epidemiological studies can be detected.

Conclusion

SVR normalization can effectively remove the unwanted intra- and inter-batch variations, and is much better than other common normalization methods.

3.

Introduction

Although cultured cells are nowadays regularly analyzed by metabolomics technologies, some issues in study setup and data processing are still not resolved to complete satisfaction: a suitable harvesting method for adherent cells, a fast and robust method for data normalization, and the proof that metabolite levels can be normalized to cell number.

Objectives

We intended to develop a fast method for normalization of cell culture metabolomics samples, to analyze how metabolite levels correlate with cell numbers, and to elucidate the impact of the kind of harvesting on measured metabolite profiles.

Methods

We cultured four different human cell lines and used them to develop a fluorescence-based method for DNA quantification. Further, we assessed the correlation between metabolite levels and cell numbers and focused on the impact of the harvesting method (scraping or trypsinization) on the metabolite profile.

Results

We developed a fast, sensitive and robust fluorescence-based method for DNA quantification showing excellent linear correlation between fluorescence intensities and cell numbers for all cell lines. Furthermore, 82–97 % of the measured intracellular metabolites displayed linear correlation between metabolite concentrations and cell numbers. We observed differences in amino acids, biogenic amines, and lipid levels between trypsinized and scraped cells.
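A minimal sketch of how a fluorescence standard curve of this kind could be used for cell-number normalization; the curve values and metabolite intensities below are invented for illustration, not the authors' measurements:

```python
import numpy as np

# Hypothetical standard curve: fluorescence readings at known cell numbers.
std_cells = np.array([1e4, 5e4, 1e5, 5e5, 1e6])
std_fluor = np.array([120.0, 590.0, 1180.0, 5900.0, 11800.0])

# Linear fit (the paper reports excellent linearity for all four cell lines).
slope, intercept = np.polyfit(std_cells, std_fluor, 1)

def cells_from_fluorescence(fluor):
    """Invert the standard curve to estimate cell number from fluorescence."""
    return (fluor - intercept) / slope

# Normalize a sample's metabolite intensities to its estimated cell number.
sample_fluor = 2400.0
sample_metabolites = np.array([8.1e5, 2.3e4, 5.7e6])
per_cell = sample_metabolites / cells_from_fluorescence(sample_fluor)
print(per_cell)
```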

Conclusion

We offer a fast, robust, and validated normalization method for cell culture metabolomics samples and demonstrate the eligibility of the normalization of metabolomics data to the cell number. We show a cell line and metabolite-specific impact of the harvesting method on metabolite concentrations.

4.

Introduction

Data sharing is being increasingly required by journals and has been heralded as a solution to the ‘replication crisis’.

Objectives

(i) Review data sharing policies of journals publishing the most metabolomics papers associated with open data and (ii) compare these journals’ policies to those that publish the most metabolomics papers.

Methods

A PubMed search was used to identify metabolomics papers. Metabolomics data repositories were manually searched for linked publications.

Results

Journals that support data sharing are not necessarily those with the most papers associated with open metabolomics data.

Conclusion

Further efforts are required to improve data sharing in metabolomics.

5.

Background

Microarray technology allows the monitoring of expression levels for thousands of genes simultaneously. This technique helps us to understand gene regulation as well as gene-by-gene interactions more systematically. In microarray experiments, however, many undesirable systematic variations are observed, and some variations are commonly observed even in replicated experiments. Normalization is the process of removing sources of variation that affect the measured gene expression levels. Although a number of normalization methods have been proposed, it has been difficult to decide which methods perform best. Normalization plays an important role in the early stage of microarray data analysis, and the subsequent analysis results are highly dependent on it.

Results

In this paper, we use the variability among the replicated slides to compare performance of normalization methods. We also compare normalization methods with regard to bias and mean square error using simulated data.

Conclusions

Our results show that intensity-dependent normalization often performs better than global normalization methods, and that linear and nonlinear normalization methods perform similarly. These conclusions are based on analysis of 36 cDNA microarrays of 3,840 genes obtained in an experiment to search for changes in gene expression profiles during neuronal differentiation of cortical stem cells. Simulation studies confirm our findings.
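A minimal sketch contrasting global (median) normalization with intensity-dependent (lowess) normalization of a two-channel array, under the usual M = log2(R/G), A = 0.5·log2(R·G) convention; the simulated dye bias is an assumption for illustration:

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

def normalize_ma(red, green, method="loess"):
    """Normalize a two-channel array's log-ratios M against intensity A."""
    m = np.log2(red) - np.log2(green)
    a = 0.5 * (np.log2(red) + np.log2(green))
    if method == "global":
        return m - np.median(m)            # one constant shift for all genes
    # Intensity-dependent: subtract a lowess curve fitted to the MA-plot.
    fit = lowess(m, a, frac=0.3, return_sorted=False)
    return m - fit

rng = np.random.default_rng(1)
g = rng.lognormal(8, 1, 3840)
r = g * 2 ** (0.2 + 0.1 * np.log2(g))      # simulated intensity-dependent dye bias
print(np.abs(normalize_ma(r, g, "global")).mean(),
      np.abs(normalize_ma(r, g, "loess")).mean())   # residual bias per method
```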

6.

Introduction

Untargeted metabolomics is a powerful tool for biological discoveries. To analyze the complex raw data, significant advances in computational approaches have been made, yet it is not clear how exhaustive and reliable the data analysis results are.

Objectives

Assessment of the quality of raw data processing in untargeted metabolomics.

Methods

Five published untargeted metabolomics studies were reanalyzed.

Results

Omissions of at least 50 relevant compounds from the original results as well as examples of representative mistakes were reported for each study.

Conclusion

Incomplete raw data processing leaves unexplored potential in current and legacy data.

7.

Introduction

Prostate cancer (PCa) is one of the most common malignancies in men worldwide. Serum prostate specific antigen (PSA) level has been extensively used as a biomarker to detect PCa. However, PSA is not cancer-specific and various non-malignant conditions, including benign prostatic hyperplasia (BPH), can cause a rise in PSA blood levels, thus leading to many false positive results.

Objectives

In this study, we evaluated the potential of urinary metabolomic profiling for discriminating PCa from BPH.

Methods

Urine samples from 64 PCa patients and 51 individuals diagnosed with BPH were analysed using ¹H nuclear magnetic resonance (¹H-NMR) spectroscopy. Comparative analysis of urinary metabolomic profiles was carried out using multivariate and univariate statistical approaches.

Results

The urine metabolomic profile of PCa patients is characterised by increased concentrations of branched-chain amino acids (BCAA), glutamate and pseudouridine, and decreased concentrations of glycine, dimethylglycine, fumarate and 4-imidazole-acetate compared with individuals diagnosed with BPH.

Conclusion

PCa patients have a specific urinary metabolomic profile. The results of our study underscore the clinical potential of metabolomic profiling to uncover metabolic changes that could be useful to discriminate PCa from BPH in a clinical context.

8.

Background

An important feature in many genomic studies is quality control and normalization. This is particularly important when analyzing epigenetic data, where the process of obtaining measurements can be bias prone. The GAW20 data was from the Genetics of Lipid Lowering Drugs and Diet Network (GOLDN), a study with multigeneration families, where DNA cytosine-phosphate-guanine (CpG) methylation was measured pre- and posttreatment with fenofibrate. We performed quality control assessment of the GAW20 DNA methylation data, including normalization, assessment of batch effects and detection of sample swaps.

Results

We show that even after normalization, the GOLDN methylation data has systematic differences pre- and posttreatment. Through investigation of (a) CpGs sites containing a single nucleotide polymorphism, (b) the stability of breeding values for methylation across time points, and (c) autosomal gender-associated CpGs, 13 sample swaps were detected, 11 of which were posttreatment.

Conclusions

This paper demonstrates several ways to perform quality control of methylation data in the absence of raw data files and highlights the importance of normalization and quality control of the GAW20 methylation data from the GOLDN study.

9.

Introduction

In metabolomics studies, unwanted variation inevitably arises from various sources. Normalization, that is the removal of unwanted variation, is an essential step in the statistical analysis of metabolomics data. However, metabolomics normalization is often considered an imprecise science due to the diverse sources of variation and the availability of a number of alternative strategies that may be implemented.

Objectives

We highlight the need for comparative evaluation of different normalization methods and present software strategies to help ease this task for both data-oriented and biological researchers.

Methods

We present NormalizeMets—a joint graphical user interface within the familiar Microsoft Excel and freely-available R software for comparative evaluation of different normalization methods. The NormalizeMets R package along with the vignette describing the workflow can be downloaded from https://cran.r-project.org/web/packages/NormalizeMets/. The Excel Interface and the Excel user guide are available on https://metabolomicstats.github.io/ExNormalizeMets.

Results

NormalizeMets allows for comparative evaluation of normalization methods using criteria that depend on the given dataset and the ultimate research question. It thus guides researchers to assess, select, and implement a suitable normalization method using either the familiar Microsoft Excel interface or the freely-available R software. In addition, the package can be used to visualise metabolomics data with interactive graphical displays and to obtain end statistical results for clustering, classification, biomarker identification adjusting for confounding variables, and correlation analysis.

Conclusion

NormalizeMets is designed for comparative evaluation of normalization methods, and can also be used to obtain end statistical results. The use of freely-available R software offers an attractive proposition for programming-oriented researchers, and the Excel interface offers a familiar alternative to most biological researchers. The package handles the data locally in the user’s own computer allowing for reproducible code to be stored locally.

10.

Background

New technologies for acquisition of genomic data, while offering unprecedented opportunities for genetic discovery, also impose severe burdens of interpretation and penalties for multiple testing.

Methods

The Pathway-based Analyses Group of the Genetic Analysis Workshop 19 (GAW19) sought reduction of multiple-testing burden through various approaches to aggregation of high-dimensional data in pathways informed by prior biological knowledge.

Results

Experimental methods tested included the use of "synthetic pathways" (random sets of genes) to estimate power and false-positive error rate of methods applied to simulated data; data reduction via independent components analysis, single-nucleotide polymorphism (SNP)-SNP interaction, and use of gene sets to estimate genetic similarity; and general assessment of the efficacy of prior biological knowledge to reduce the dimensionality of complex genomic data.

Conclusions

The work of this group explored several promising approaches to managing high-dimensional data, with the caveat that these methods are necessarily constrained by the quality of external bioinformatic annotation.

11.

Background

In recent years, the visualization of biomagnetic measurement data by so-called pseudo current density maps, or Hosaka-Cohen (HC) transformations, has become popular.

Methods

The physical basis of these intuitive maps is clarified by means of analytically solvable problems.

Results

Examples in magnetocardiography, magnetoencephalography and magnetoneurography demonstrate the usefulness of this method.

Conclusion

Hardware realizations of the HC-transformation and some similar transformations are discussed which could advantageously support cross-platform comparability of biomagnetic measurements.

12.

Introduction

Untargeted metabolomics workflows include numerous points where variance and systematic errors can be introduced. Due to the diversity of the lipidome, manual peak picking and quantitation using molecule specific internal standards is unrealistic, and therefore quality peak picking algorithms and further feature processing and normalization algorithms are important. Subsequent normalization, data filtering, statistical analysis, and biological interpretation are simplified when quality data acquisition and feature processing are employed.

Objectives

Metrics for QC are important throughout the workflow. The robust workflow presented here provides techniques to ensure that QC checks are implemented throughout sample preparation, data acquisition, pre-processing, and analysis.

Methods

The untargeted lipidomics workflow includes sample standardization prior to acquisition, blocks of QC standards and blanks run at systematic intervals between randomized blocks of experimental data, blank feature filtering (BFF) to remove features not originating from the sample, and QC analysis of data acquisition and processing.
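A minimal sketch of the blank-feature-filtering idea; the threefold threshold and function names are illustrative assumptions, not the paper's exact BFF criteria:

```python
import numpy as np

def blank_feature_filter(samples, blanks, fold=3.0):
    """Keep features whose mean sample intensity exceeds `fold` times the
    mean blank intensity, i.e. features likely originating from the sample
    rather than from solvents, plasticware, or carryover.

    samples : (n_samples, n_features) intensity matrix
    blanks  : (n_blanks,  n_features) intensity matrix
    """
    keep = samples.mean(axis=0) > fold * blanks.mean(axis=0)
    return samples[:, keep], keep

rng = np.random.default_rng(2)
samples = rng.lognormal(5, 1, (24, 500))
blanks = rng.lognormal(5, 1, (6, 500))
blanks[:, :400] /= 50                  # most features are absent from blanks
filtered, mask = blank_feature_filter(samples, blanks)
print(filtered.shape)                  # background-derived features removed
```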

Results

The workflow was successfully applied to mouse liver samples, which were investigated to discern lipidomic changes throughout the development of nonalcoholic fatty liver disease (NAFLD). The workflow, including a novel filtering method, BFF, allows improved confidence in results and conclusions for lipidomic applications.

Conclusion

Using a mouse model developed for the study of the transition of NAFLD from an early stage known as simple steatosis, to the later stage, nonalcoholic steatohepatitis, in combination with our novel workflow, we have identified phosphatidylcholines, phosphatidylethanolamines, and triacylglycerols that may contribute to disease onset and/or progression.

13.

Introduction

Failure to properly account for normal systematic variations in OMICS datasets may result in misleading biological conclusions. Accordingly, normalization is a necessary step in the proper preprocessing of OMICS datasets. In this regard, an optimal normalization method will effectively reduce unwanted biases and increase the accuracy of downstream quantitative analyses. However, it is currently unclear which normalization method is best, since each algorithm addresses systematic noise in different ways.

Objective

To determine an optimal normalization method for the preprocessing of metabolomics datasets.

Methods

Nine MVAPACK normalization algorithms were compared with simulated and experimental NMR spectra modified with added Gaussian noise and random dilution factors. Methods were evaluated based on their ability to recover the intensities of the true spectral peaks and the reproducibility of true classifying features from an orthogonal projections to latent structures discriminant analysis (OPLS-DA) model.

Results

Most normalization methods (except histogram matching) performed equally well at modest levels of signal variance. Only probabilistic quotient (PQ) and constant sum (CS) maintained the highest level of peak recovery (>67%) and correlation with true loadings (>0.6) at maximal noise.
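A minimal sketch of the two best-performing methods; PQ follows the standard probabilistic-quotient recipe (integral normalization, point-wise quotients against a reference spectrum, division by the median quotient), and the simulated dilution factors are illustrative:

```python
import numpy as np

def constant_sum(x):
    """CS: scale each spectrum so its intensities sum to one."""
    return x / x.sum(axis=1, keepdims=True)

def probabilistic_quotient(x, reference=None):
    """PQ: divide each spectrum by the median of its point-wise quotients
    against a reference spectrum (the median spectrum by default)."""
    x = constant_sum(x)                       # integral normalization first
    if reference is None:
        reference = np.median(x, axis=0)
    quotients = x / reference
    return x / np.median(quotients, axis=1, keepdims=True)

rng = np.random.default_rng(3)
spectra = rng.lognormal(0, 1, (10, 200))
dilution = rng.uniform(0.5, 2.0, (10, 1))    # random dilution factors
normalized = probabilistic_quotient(spectra * dilution)
print(normalized.std(axis=0).mean())         # residual between-sample spread
```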

Conclusion

PQ and CS performed the best at recovering peak intensities and reproducing the true classifying features for an OPLS-DA model regardless of spectral noise level. Our findings suggest that performance is largely determined by the level of noise in the dataset, while the effect of dilution factors was negligible. A minimal allowable noise level of 20% was also identified for a valid NMR metabolomics dataset.

14.

Introduction

Collecting feces is easy, and it offers direct access to endogenous and microbial metabolites.

Objectives

Given the lack of consensus on fecal sample preparation, especially in animal species, we developed a robust protocol for untargeted LC-HRMS fingerprinting.

Methods

The conditions of extraction (quantity, preparation, solvents, dilutions) were investigated in bovine feces.

Results

A rapid and simple protocol involving feces extraction with methanol (1/3, M/V) followed by centrifugation and a filtration step (10 kDa) was developed.

Conclusion

The workflow generated repeatable and informative fingerprints for robust metabolome characterization.

15.

Introduction

One of the body fluids often used in metabolomics studies is urine. The concentrations of metabolites in urine are affected by the hydration status of an individual, resulting in dilution differences; the data must therefore be normalized to correct for them. Two normalization techniques are commonly applied to urine samples prior to further statistical analysis. The first, AUC normalization, standardizes the area under the curve (AUC) within a sample to the median, the mean, or any other suitable representation of the amount of dilution. The second approach uses specific end-product metabolites such as creatinine: all intensities within a sample are expressed relative to the creatinine intensity.
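A minimal sketch of both conventional techniques; the creatinine index and simulated dilution factors are illustrative assumptions:

```python
import numpy as np

def auc_normalize(spectra):
    """Scale each urine spectrum so its total area equals the cohort median area."""
    areas = spectra.sum(axis=1, keepdims=True)
    return spectra / areas * np.median(areas)

def creatinine_normalize(spectra, creatinine_index):
    """Express all intensities relative to the creatinine signal."""
    return spectra / spectra[:, [creatinine_index]]

rng = np.random.default_rng(4)
spectra = rng.lognormal(0, 1, (12, 300))
spectra *= rng.uniform(0.3, 3.0, (12, 1))   # hydration-driven dilution differences
print(auc_normalize(spectra).sum(axis=1))   # equal areas after normalization
```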

Objectives

Another way of looking at urine metabolomics data is by realizing that the ratios between peak intensities are the information-carrying features. This opens up possibilities to use another class of data analysis techniques designed to deal with such ratios: compositional data analysis. The aim of this paper is to develop PARAFAC modeling of three-way urine metabolomics data in the context of compositional data analysis and compare this with standard normalization techniques.

Methods

In the compositional data analysis approach, special coordinate systems are defined to deal with the ratio problem. In essence, it comes down to using distance measures other than the Euclidean distance that is used in the conventional analysis of metabolomics data.
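A minimal sketch of one such coordinate system, the centred log-ratio (clr) transform, under which distances depend only on ratios between intensities, as the compositional approach requires:

```python
import numpy as np

def clr(x):
    """Centred log-ratio: log of each part over the geometric mean of the
    composition, so results depend only on ratios between intensities."""
    logx = np.log(x)
    return logx - logx.mean(axis=-1, keepdims=True)

a = np.array([0.2, 0.3, 0.5])
b = 4.0 * a                          # same composition, different dilution
print(np.allclose(clr(a), clr(b)))   # True: clr is dilution-invariant
```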

Results

We illustrate the use of this approach in combination with three-way methods (i.e., PARAFAC) on a longitudinal urine metabolomics study and two simulations. In both cases, the advantage of the compositional approach is established in terms of improved interpretability of the scores and loadings of the PARAFAC model.

Conclusion

For urine metabolomics studies, we advocate the use of compositional data analysis approaches. They are easy to use, well established and proven to give reliable results.

16.

Background

Cerebral infarctions of different etiologies appear to differ in fibrinogen levels, so the current work explores the relationship between fibrinogen level and the subtypes of the TOAST criteria in the acute stage of ischemic stroke.

Methods

A total of 577 acute ischemic stroke patients treated in our hospital from December 2008 to December 2010 were enrolled, and fibrinogen (PT-der) was measured in blood samples taken within 72 hours of onset. Patients were classified according to the TOAST criteria to study the distribution of fibrinogen levels across the stroke subtypes.

Results

The distribution of fibrinogen levels across the subtypes showed no statistically significant differences.

Conclusions

In the acute stage of ischemic stroke, fibrinogen level was not related to the subtypes of the TOAST criteria.

17.

Introduction

Mass spectrometry imaging (MSI) experiments result in complex multi-dimensional datasets, which require specialist data analysis tools.

Objectives

We have developed massPix—an R package for analysing and interpreting data from MSI of lipids in tissue.

Methods

massPix produces single ion images, performs multivariate statistics and provides putative lipid annotations based on accurate mass matching against generated lipid libraries.

Results

Classification of tissue regions with high spectral similarity can be carried out by principal components analysis (PCA) or k-means clustering.
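A generic sketch of this kind of segmentation with scikit-learn; massPix itself is an R package, so none of the names below are its API, and the toy dataset is an assumption:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)
# Toy MSI dataset: 40 x 40 pixels, 500 m/z bins per pixel spectrum.
spectra = rng.lognormal(0, 1, (1600, 500))
spectra[:800, :50] *= 5                   # one tissue region enriched in some ions

scores = PCA(n_components=10).fit_transform(spectra)     # reduce dimensionality
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(scores)
segmentation = labels.reshape(40, 40)     # cluster image of tissue regions
print(np.bincount(labels))
```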

Conclusion

massPix is an open-source tool for the analysis and statistical interpretation of MSI data, and is particularly useful for lipidomics applications.

18.

Introduction

In plant metabolomics, metabolite contents are often normalized by sample weight. However, accurate weighing of very small samples, such as individual Arabidopsis thaliana seeds (approximately 20 µg), is difficult, which may lead to irreproducible results.

Objectives

We aimed to establish alternative normalization methods for single-grain-based comparative metabolomics of A. thaliana.

Methods

Arabidopsis thaliana seeds were assumed to have a prolate spheroid shape. Using a microscope image of each seed, the lengths of major and minor axes were measured by fitting a projected 2-dimensional shape of each seed as an ellipse. Metabolic profiles of individual diploid or tetraploid A. thaliana seeds were measured by our highly sensitive protocol (“widely targeted metabolomics”) that uses liquid chromatography coupled with tandem quadrupole mass spectrometry. Mass spectrometric analysis of 1 µL of solution extract identified more than 100 metabolites. The data were normalized by various seed-size measures, including seed volume (single-grain-based analysis). For comparison, metabolites were extracted from 4 mg of diploid and tetraploid A. thaliana seeds and their metabolic profiles were analyzed by normalization of weight (weight-based analysis).
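A minimal sketch of the seed-volume computation implied by the prolate-spheroid assumption, V = (4/3)π·a·b² with semi-axes a and b taken from the fitted ellipse; the seed dimensions and peak area below are illustrative:

```python
import math

def prolate_spheroid_volume(major_axis, minor_axis):
    """Volume of a prolate spheroid from its full axis lengths:
    V = 4/3 * pi * a * b**2, with semi-axes a = major/2 and b = minor/2."""
    a = major_axis / 2.0
    b = minor_axis / 2.0
    return 4.0 / 3.0 * math.pi * a * b * b

# Illustrative A. thaliana seed dimensions (~0.5 mm x 0.3 mm ellipse fit).
volume_mm3 = prolate_spheroid_volume(0.5, 0.3)
intensity = 1.8e6                        # hypothetical metabolite peak area
print(intensity / volume_mm3)            # volume-normalized intensity
```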

Results

A small number of metabolites showed statistically significant differences in the single-grain-based analysis compared to weight-based analysis. A total of 17 metabolites showed statistically different accumulation between ploidy types with similar fold changes in both analyses.

Conclusion

Seed-size measures obtained by microscopic imaging were useful for data normalization. Single-grain-based analysis enables evaluation of metabolism of each seed and elucidates the metabolic profiles of precious bioresources by using small amounts of samples.

19.

Background

Lower urinary tract symptoms (LUTS) and prostate specific antigen-based parameters seem to have only a limited utility for the differential diagnosis of prostate cancer (PCa). MALDI-TOF/MS peptidomic profiling could be a useful diagnostic tool for biomarker discovery, although reproducibility issues have limited its applicability until now. The current study aimed to evaluate a new MALDI-TOF/MS candidate biomarker.

Methods

Within- and between-subject variability of MALDI-TOF/MS-based peptidomic urine and serum analyses was evaluated in 20 and 15 healthy donors, respectively. Normalization and approaches for accounting for below-limit-of-detection (LOD) values were utilized to enhance reproducibility, and Monte Carlo experiments were performed to verify whether measurement error can be dealt with in the presence of below-LOD data. Post-prostatic massage urine and serum samples from 148 LUTS patients were analysed using MALDI-TOF/MS. Regression-calibration and simulation-extrapolation methods were used to derive the unbiased association between peptidomic features and PCa.

Results

Although the median normalized peptidomic variability was 24.9%, the within- and between-subject variability analysis showed that median normalization, LOD adjustment, and log2 data transformation were the best combination in terms of reliability; under measurement-error conditions, the intraclass correlation coefficient was a reliable estimate when LOD/2 was substituted for below-LOD values. In the patients studied, 43 peptides were shared by urine and serum, and several features were found to be associated with PCa. Only a few serum features, however, showed statistical significance after the multiple-testing procedures were completed. Two serum fragmentation patterns corresponded to complement C4-A.
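A minimal sketch of the preprocessing combination reported as most reliable (LOD/2 substitution for below-LOD values, per-sample median normalization, log2 transformation); the function name, processing order, and simulated values are illustrative assumptions:

```python
import numpy as np

def preprocess_peptidome(x, lod):
    """LOD/2 substitution, per-sample median scaling, log2 transform.

    x   : (n_samples, n_features) intensities, NaN where below LOD
    lod : (n_features,) limit of detection per feature
    """
    x = np.where(np.isnan(x), lod / 2.0, x)          # LOD/2 substitution
    x = x / np.median(x, axis=1, keepdims=True)      # per-sample median scaling
    return np.log2(x)

rng = np.random.default_rng(6)
data = rng.lognormal(3, 1, (20, 50))
lod = np.full(50, 2.0)
data[data < lod] = np.nan                            # censor below-LOD values
print(preprocess_peptidome(data, lod).shape)
```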

Conclusions

MALDI-TOF/MS serum peptidome profiling was more effective than post-prostatic massage urine analysis in discriminating PCa.

20.

Background

Centrifugation is an indispensable procedure for plasma sample preparation, but applied conditions can vary between labs.

Aim

Determine whether routinely used plasma centrifugation protocols (1500×g for 10 min; 3000×g for 5 min) influence non-targeted metabolomic analyses.

Methods

Nuclear magnetic resonance (NMR) spectroscopy and high-resolution mass spectrometry (HRMS) data were evaluated with sparse partial least squares discriminant analyses and compared with cell count measurements.

Results

Besides significant differences in platelet count, we identified substantial alterations in NMR and HRMS data related to the different centrifugation protocols.

Conclusion

Even minor differences in plasma centrifugation can significantly influence metabolomic patterns and potentially bias metabolomics studies.
