Found 20 similar articles (search time: 265 ms)
1.
Amy McMillan Adebola E. Orimadegun Mark W. Sumarah Justin Renaud Magdalena Muc da Encarnacao Gregory B. Gloor Olusegun O. Akinyinka Gregor Reid Stephen J. Allen 《Metabolomics : Official journal of the Metabolomic Society》2017,13(2):13
Introduction
Severe acute malnutrition (SAM) is a major cause of child mortality worldwide; however, the pathogenesis of SAM remains poorly understood. Recent studies have uncovered an altered gut microbiota composition in children with SAM, suggesting a role for microbes in the pathogenesis of malnutrition.
Objectives
To elucidate the metabolic consequences of SAM and whether these changes are associated with changes in gut microbiota composition.
Methods
We applied an untargeted multi-platform metabolomics approach [gas chromatography–mass spectrometry (GC-MS) and liquid chromatography–mass spectrometry (LC-MS)] to stool and plasma samples from 47 Nigerian children with SAM and 11 control children. The composition of the stool microbiota was assessed by 16S rRNA gene sequencing.
Results
The plasma metabolome discriminated children with SAM from controls, while no significant differences were observed in the microbial or small molecule composition of stool. The abundance of 585 features in plasma was significantly altered in malnourished children (Wilcoxon test, FDR corrected P < 0.1), representing approximately 15% of the metabolome. Consistent with previous studies, children with SAM exhibited a marked reduction in amino acids/dipeptides and phospholipids, and an increase in acylcarnitines. We also identified numerous metabolic perturbations which have not been reported previously, including increased disaccharides, truncated fibrinopeptides, angiotensin I, dihydroxybutyrate, lactate, and heme, and decreased bioactive lipids belonging to the eicosanoid and docosanoid family.
Conclusion
Our findings provide a deeper understanding of the metabolic consequences of malnutrition. Further research is required to determine if specific metabolites may guide improved management, and/or act as novel biomarkers for assessing response to treatment.
2.
Korey J. Brownstein Mahmoud Gargouri William R. Folk David R. Gang 《Metabolomics : Official journal of the Metabolomic Society》2017,13(11):133
Introduction
Botanicals containing iridoid and phenylethanoid/phenylpropanoid glycosides are used worldwide for the treatment of inflammatory musculoskeletal conditions that are primary causes of human years lived with disability, such as arthritis and lower back pain.
Objectives
We report the analysis of candidate anti-inflammatory metabolites of several endemic Scrophularia species and Verbascum thapsus used medicinally by peoples of North America.
Methods
Leaves, stems, and roots were analyzed by ultra-performance liquid chromatography-mass spectrometry (UPLC-MS) and partial least squares-discriminant analysis (PLS-DA) was performed in MetaboAnalyst 3.0 after processing the datasets in Progenesis QI.
Results
Comparison of the datasets revealed significant and differential accumulation of iridoid and phenylethanoid/phenylpropanoid glycosides in the tissues of the endemic Scrophularia species and Verbascum thapsus.
Conclusions
Our investigation identified several species of pharmacological interest as good sources for harpagoside and other important anti-inflammatory metabolites.
3.
Background
Human cancers are complex ecosystems composed of cells with distinct molecular signatures. Such intratumoral heterogeneity poses a major challenge to cancer diagnosis and treatment. Recent advances in single-cell techniques such as scRNA-seq have brought unprecedented insights into cellular heterogeneity. A resulting computational challenge is to cluster high-dimensional, noisy datasets with substantially fewer cells than genes.
Methods
In this paper, we introduce a consensus clustering framework, conCluster, for cancer subtype identification from single-cell RNA-seq data. Using an ensemble strategy, conCluster fuses multiple basic partitions into consensus clusters.
Results
Applied to real cancer scRNA-seq datasets, conCluster can more accurately detect cancer subtypes than the widely used scRNA-seq clustering methods. Further, we conducted co-expression network analysis for the identified melanoma subtypes.
Conclusions
Our analysis demonstrates that these subtypes exhibit distinct gene co-expression networks and significant gene sets with different functional enrichment.
4.
Background
Microarray technology is often used to identify the genes that are differentially expressed between two biological conditions. On the other hand, since microarray datasets contain a small number of samples and a large number of genes, it is usually desirable to identify small gene subsets with distinct patterns between sample classes. Such gene subsets are highly discriminative in phenotype classification because of their tightly coupled features. Unfortunately, classifiers identified in this way tend to generalize poorly to test samples because of overfitting.
Results
We propose a novel approach combining supervised and unsupervised learning techniques to generate increasingly discriminative gene clusters in an iterative manner. Our experiments on both simulated and real datasets show that our method can produce a series of robust gene clusters with good classification performance compared with existing approaches.
Conclusion
This backward approach for refining a series of highly discriminative gene clusters for classification purposes proves to be very consistent and stable when applied to various types of training samples.
5.
Background
DNase I hypersensitive sites (DHSs) are associated with cis-regulatory DNA elements. An efficient method of identifying DHSs can enhance the understanding of chromatin accessibility. Despite a multitude of resources available online, including experimental datasets and computational tools, the complex language of DHSs remains incompletely understood.
Methods
Here, we address this challenge using an approach based on a state-of-the-art machine learning method. We present a novel convolutional neural network (CNN) that combines Inception-like networks with a gating mechanism, to respond to multiple patterns and long-term associations in DNA sequences, in order to predict multi-scale DHSs in Arabidopsis, rice and Homo sapiens.
Results
Our method achieves an area under the curve (AUC) of 0.961 on Arabidopsis, 0.969 on rice and 0.918 on Homo sapiens.
Conclusions
Our method provides an efficient and accurate way to identify multi-scale DHS sequences by deep learning.
6.
Patrick J. C. Tardivel Cécile Canlet Gaëlle Lefort Marie Tremblay-Franco Laurent Debrauwer Didier Concordet Rémi Servien 《Metabolomics : Official journal of the Metabolomic Society》2017,13(10):109
Introduction
Experiments in metabolomics rely on the identification and quantification of metabolites in complex biological mixtures. This remains one of the major challenges in NMR/mass spectrometry analysis of metabolic profiles. These capabilities are essential if metabolomics is to become a general approach for testing a priori formulated hypotheses on the basis of exhaustive metabolome characterization, rather than an exploratory tool dealing with unknown metabolic features.
Objectives
In this article we propose a method, named ASICS, based on a strong statistical theory, that automatically handles metabolite identification and quantification in proton NMR spectra.
Methods
A statistical linear model is built to explain a complex spectrum using a library containing pure metabolite spectra. This model can handle local or global chemical shift variations due to experimental conditions using a warping function. A statistical lasso-type estimator identifies and quantifies the metabolites in the complex spectrum. This estimator shows good statistical properties and handles peak overlapping issues.
Results
The performance of the method was investigated on known mixtures (such as synthetic urine) and on plasma datasets from duck and human. The method performed well, outperforming existing methods.
Conclusion
ASICS is a completely automated procedure to identify and quantify metabolites in 1H NMR spectra of biological mixtures. It will strengthen NMR-based metabolomics by helping experts obtain metabolic profiles quickly and accurately.
7.
Nicholas J. Bond Albert Koulman Julian L. Griffin Zoe Hall 《Metabolomics : Official journal of the Metabolomic Society》2017,13(11):128
Introduction
Mass spectrometry imaging (MSI) experiments result in complex multi-dimensional datasets, which require specialist data analysis tools.
Objectives
We have developed massPix, an R package for analysing and interpreting data from MSI of lipids in tissue.
Methods
massPix produces single ion images, performs multivariate statistics and provides putative lipid annotations based on accurate mass matching against generated lipid libraries.
Results
Classification of tissue regions with high spectral similarity can be carried out by principal components analysis (PCA) or k-means clustering.
Conclusion
massPix is an open-source tool for the analysis and statistical interpretation of MSI data, and is particularly useful for lipidomics applications.
8.
Thao Vu Eli Riekeberg Yumou Qiu Robert Powers 《Metabolomics : Official journal of the Metabolomic Society》2018,14(8):108
Introduction
Failure to properly account for normal systematic variations in OMICS datasets may result in misleading biological conclusions. Accordingly, normalization is a necessary step in the proper preprocessing of OMICS datasets. In this regard, an optimal normalization method will effectively reduce unwanted biases and increase the accuracy of downstream quantitative analyses. However, it is currently unclear which normalization method is best since each algorithm addresses systematic noise in different ways.
Objective
Determine an optimal choice of a normalization method for the preprocessing of metabolomics datasets.
Methods
Nine MVAPACK normalization algorithms were compared with simulated and experimental NMR spectra modified with added Gaussian noise and random dilution factors. Methods were evaluated based on their ability to recover the intensities of the true spectral peaks and the reproducibility of true classifying features from an orthogonal projections to latent structures discriminant analysis (OPLS-DA) model.
Results
Most normalization methods (except histogram matching) performed equally well at modest levels of signal variance. Only probabilistic quotient (PQ) and constant sum (CS) maintained the highest level of peak recovery (> 67%) and correlation with true loadings (> 0.6) at maximal noise.
Conclusion
PQ and CS performed the best at recovering peak intensities and reproducing the true classifying features for an OPLS-DA model regardless of spectral noise level. Our findings suggest that performance is largely determined by the level of noise in the dataset, while the effect of dilution factors was negligible. A minimal allowable noise level of 20% was also identified for a valid NMR metabolomics dataset.
9.
Background
Biomedical extraction based on supervised machine learning still faces the problem that a limited labeled dataset does not saturate the learning method. Many supervised learning algorithms for bio-event extraction have been affected by data sparseness.
Methods
In this study, a semi-supervised method for combining labeled data with large-scale unlabeled data is presented to improve the performance of biomedical event extraction. We propose a rich set of feature vectors, including a variety of syntactic and semantic features, such as N-gram features, walk subsequence features, predicate argument structure (PAS) features and, in particular, new features derived from a strategy named Event Feature Coupling Generalization (EFCG). The EFCG algorithm can create useful event recognition features by making use of the correlation between two sorts of original features explored from the labeled data, while the correlation is computed with the help of massive amounts of unlabeled data. The EFCG approach aims to solve the data sparseness problem caused by a limited tagged corpus, and enables the new features to cover much more event-related information with better generalization properties.
Results
The effectiveness of our event extraction system is evaluated on the datasets from the BioNLP Shared Task 2011 and PubMed. Experimental results demonstrate state-of-the-art performance in the fine-grained biomedical information extraction task.
Conclusions
Limited labeled data could be combined with unlabeled data to tackle the data sparseness problem by means of our EFCG approach, and the classification capability of the model was enhanced by establishing a rich feature set from both labeled and unlabeled datasets. This semi-supervised learning approach could therefore go far towards improving the performance of the event extraction system. To the best of our knowledge, it was the first attempt at combining labeled and unlabeled data for tasks related to biomedical event extraction.
10.
Antonella Del-Corso Mario Cappiello Roberta Moschini Francesco Balestri Umberto Mura 《Metabolomics : Official journal of the Metabolomic Society》2018,14(1):2
Introduction
While the evolutionary adaptation of enzymes to their own substrates is a well assessed and rationalized field, how molecules were originally selected in order to initiate and assemble convenient metabolic pathways is a fascinating, but still debated, topic.
Objectives
The aim of the present study is to give a rationale for the preferential selection of specific molecules to generate metabolic pathways.
Methods
The comparison of structural features of molecules, through an inductive methodological approach, offers a key to cautiously propose a determining factor for their metabolic recruitment.
Results
Starting with some commonplaces occurring in the structural representation of relevant carbohydrates, such as glucose, fructose and ribose, arguments are presented associating stable structural determinants of these molecules with their peculiar occurrence in metabolic pathways.
Conclusions
Among other possible factors, the reliability of the structural asset of a molecule may be relevant for its selection among structurally and, a priori, functionally similar molecules.
11.
Yinglei Lai 《BMC bioinformatics》2017,18(3):69
Background
The q-value is a widely used statistical method for estimating false discovery rate (FDR), which is a conventional significance measure in the analysis of genome-wide expression data. The q-value is a random variable and it may underestimate FDR in practice. An underestimated FDR can lead to unexpected false discoveries in the follow-up validation experiments. This issue has not been well addressed in the literature, especially in situations where a permutation procedure is necessary for p-value calculation.
Results
We propose a statistical method for the conservative adjustment of the q-value. In practice, it is usually necessary to calculate the p-value by a permutation procedure. This was also considered in our adjustment method. We used simulation data as well as experimental microarray or sequencing data to illustrate the usefulness of our method.
Conclusions
The conservativeness of our approach has been mathematically confirmed in this study. We have demonstrated the importance of conservative adjustment of the q-value, particularly when the proportion of differentially expressed genes is small or the overall differential expression signal is weak.
12.
Ekaterina Smirnova Jamie J. Kwan Ryan Siu Xin Gao Georg Zoidl Borries Demeler Vivian Saridakis Logan W. Donaldson
Background
CASKIN2 is a homolog of CASKIN1, a scaffolding protein that participates in a signaling network with CASK (calcium/calmodulin-dependent serine kinase). Despite a high level of homology between CASKIN2 and CASKIN1, CASKIN2 cannot bind CASK due to the absence of a CASK Interaction Domain and, consequently, may have evolved undiscovered structural and functional distinctions.
Results
We demonstrate that the crystal structure of the Sterile Alpha Motif (SAM) domain tandem (SAM1-SAM2) oligomer from CASKIN2 differs from that of CASKIN1, with the minimal repeating unit being a dimer, rather than a monomer. Analytical ultracentrifugation sedimentation velocity methods revealed differences in monomer/dimer equilibria across a range of concentrations and ionic strengths for the wild type CASKIN2 SAM tandem and a structure-directed double mutant that could not oligomerize. Further distinguishing CASKIN2 from CASKIN1, EGFP-tagged SAM tandem proteins expressed in Neuro2a cells produced puncta that were distinct both in shape and size.
Conclusions
This study illustrates a new way in which neuronal SAM domains can assemble into large macromolecular assemblies that might concentrate and amplify synaptic responses.
13.
Sonia Liggi Christine Hinz Zoe Hall Maria Laura Santoru Simone Poddighe John Fjeldsted Luigi Atzori Julian L. Griffin 《Metabolomics : Official journal of the Metabolomic Society》2018,14(4):52
Introduction
Data processing is one of the biggest problems in metabolomics, given the high number of samples analyzed and the need for multiple software packages for each step of the processing workflow.
Objectives
To merge into a single platform the steps required for metabolomics data processing.
Methods
KniMet is a workflow for the processing of mass spectrometry-metabolomics data based on the KNIME Analytics platform.
Results
The approach includes key steps to follow in metabolomics data processing: feature filtering, missing value imputation, normalization, batch correction and annotation.
Conclusion
KniMet provides the user with a local, modular and customizable workflow for the processing of both GC–MS and LC–MS open profiling data.
14.
Ferran Casbas Pinto Srinivarao Ravipati David A. Barrett T. Charles Hodgman 《Metabolomics : Official journal of the Metabolomic Society》2017,13(7):81
Introduction
It is difficult to elucidate the metabolic and regulatory factors causing lipidome perturbations.
Objectives
This work simplifies this process.
Methods
A method has been developed to query an online holistic lipid metabolic network (of 7923 metabolites) to extract the pathways that connect the input list of lipids.
Results
The output enables pathway visualisation and the querying of other databases to identify potential regulators. When used to study a plasma lipidome dataset of polycystic ovary syndrome, 14 enzymes were identified, of which 3 are linked to ELAVL1, an mRNA stabiliser.
Conclusion
This method provides a simplified approach to identifying potential regulators causing lipid-profile perturbations.
15.
Introduction
Untargeted metabolomics is a powerful tool for biological discoveries. To analyze the complex raw data, significant advances in computational approaches have been made, yet it is not clear how exhaustive and reliable the data analysis results are.
Objectives
Assessment of the quality of raw data processing in untargeted metabolomics.
Methods
Five published untargeted metabolomics studies were reanalyzed.
Results
Omissions of at least 50 relevant compounds from the original results as well as examples of representative mistakes were reported for each study.
Conclusion
Incomplete raw data processing shows unexplored potential of current and legacy data.
16.
Daniel Cañueto Josep Gómez Reza M. Salek Xavier Correig Nicolau Cañellas 《Metabolomics : Official journal of the Metabolomic Society》2018,14(3):24
Introduction
Adoption of automatic profiling tools for 1H-NMR-based metabolomic studies still lags behind other approaches in the absence of the flexibility and interactivity necessary to adapt to the properties of study data sets of complex matrices.
Objectives
To provide an open source tool that fully integrates these needs and enables the reproducibility of the profiling process.
Methods
rDolphin incorporates novel techniques to optimize exploratory analysis, metabolite identification, and validation of profiling output quality.
Results
The information and quality achieved in two public datasets of complex matrices are maximized.
Conclusion
rDolphin is an open-source R package (http://github.com/danielcanueto/rDolphin) able to provide the best balance between accuracy, reproducibility and ease of use.
17.
Background
Measurement-unit conflicts are a perennial problem in integrative research domains such as clinical meta-analysis. As multi-national collaborations grow, as new measurement instruments appear, and as Linked Open Data infrastructures become increasingly pervasive, the number of such conflicts will similarly increase.
Methods
We propose a generic approach to the problem of (a) encoding measurement units in datasets in a machine-readable manner, (b) detecting when a dataset contains mixtures of measurement units, and (c) automatically converting any conflicting units into a desired unit, as defined for a given study.
Results
We utilized existing ontologies and standards for scientific data representation, measurement unit definition, and data manipulation to build a simple and flexible Semantic Web Service-based approach to measurement-unit harmonization. A cardiovascular patient cohort in which clinical measurements were recorded in a number of different units (e.g., mmHg and cmHg for blood pressure) was automatically classified into a number of clinical phenotypes, semantically defined using different measurement units.
Conclusions
We demonstrate that through a combination of semantic standards and frameworks, unit integration problems can be automatically detected and resolved.
18.
Daniel Drev Andrea Bileck Zeynep N. Erdem Thomas Mohr Gerald Timelthaler Andrea Beer Christopher Gerner Brigitte Marian 《Clinical proteomics》2017,14(1):33
Background
Cancer-associated fibroblasts are activated in the tumor microenvironment and contribute to tumor progression, angiogenesis, extracellular matrix remodeling, and inflammation.
Methods
To identify proteins characteristic of fibroblasts in colorectal cancer we used liquid chromatography-tandem mass spectrometry to derive protein abundance from whole-tissue homogenates of human colorectal cancer/normal mucosa pairs. Alterations of protein levels were determined by two-sided t test with greater than threefold difference and an FDR of < 0.05. Publicly available datasets were used to predict proteins of stromal origin and link protein with mRNA regulation. Immunohistochemistry confirmed the localization of selected proteins.
Results
We identified a set of 24 proteins associated with inflammation, matrix organization, TGFβ receptor signaling and angiogenesis mainly originating from the stroma. Most prominent were the increased abundance of SerpinB5 in the parenchyma and of latent transforming growth factor β-binding protein, thrombospondin-B2, and secreted protein acidic-and-cysteine-rich in the stroma. Extracellular matrix remodeling involved collagens type VIII, XII, XIV, and VI as well as lysyl-oxidase-2. In silico analysis of mRNA levels demonstrated altered expression in the tumor and the adjacent normal tissue as compared to mucosa of healthy individuals, indicating that inflammatory activation affected the surrounding tissue. Immunohistochemistry of 26 tumor specimens confirmed upregulation of SerpinB5, thrombospondin B2 and secreted protein acidic-and-cysteine-rich.
Conclusions
This study demonstrates the feasibility of detecting tumor- and compartment-specific protein-signatures that are functionally meaningful by proteomic profiling of whole-tissue extracts together with mining of RNA expression datasets. The results provide the basis for further exploration of inflammation-related stromal markers in larger patient cohorts and experimental models.
19.
Chao Xie Chin Lui Wesley Goi Daniel H. Huson Peter F. R. Little Rohan B. H. Williams 《BMC bioinformatics》2016,17(19):508
Background
Taxonomic profiling of microbial communities is often performed using small subunit ribosomal RNA (SSU) amplicon sequencing (16S or 18S), while environmental shotgun sequencing is often focused on functional analysis. Large shotgun datasets contain a significant number of SSU sequences and these can be exploited to perform an unbiased SSU-based taxonomic analysis.
Results
Here we present a new program called RiboTagger that identifies and extracts taxonomically informative ribotags located in a specified variable region of the SSU gene in a high-throughput fashion.
Conclusions
RiboTagger permits fast recovery of SSU-RNA sequences from shotgun nucleic acid surveys of complex microbial communities. The program targets all three domains of life, exhibits high sensitivity and specificity and is substantially faster than comparable programs.
20.
Riccardo Di Guida Jasper Engel J. William Allwood Ralf J. M. Weber Martin R. Jones Ulf Sommer Mark R. Viant Warwick B. Dunn 《Metabolomics : Official journal of the Metabolomic Society》2016,12(5):93