Similar articles
 20 similar articles found
1.

Introduction

Severe acute malnutrition (SAM) is a major cause of child mortality worldwide; however, the pathogenesis of SAM remains poorly understood. Recent studies have uncovered an altered gut microbiota composition in children with SAM, suggesting a role for microbes in the pathogenesis of malnutrition.

Objectives

To elucidate the metabolic consequences of SAM and determine whether they are associated with changes in gut microbiota composition.

Methods

We applied an untargeted multi-platform metabolomics approach [gas chromatography–mass spectrometry (GC-MS) and liquid chromatography–mass spectrometry (LC-MS)] to stool and plasma samples from 47 Nigerian children with SAM and 11 control children. The composition of the stool microbiota was assessed by 16S rRNA gene sequencing.

Results

The plasma metabolome discriminated children with SAM from controls, while no significant differences were observed in the microbial or small molecule composition of stool. The abundance of 585 features in plasma was significantly altered in malnourished children (Wilcoxon test, FDR-corrected P < 0.1), representing approximately 15% of the metabolome. Consistent with previous studies, children with SAM exhibited a marked reduction in amino acids/dipeptides and phospholipids, and an increase in acylcarnitines. We also identified numerous metabolic perturbations that have not been reported previously, including increased disaccharides, truncated fibrinopeptides, angiotensin I, dihydroxybutyrate, lactate, and heme, and decreased bioactive lipids belonging to the eicosanoid and docosanoid family.
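
As an illustration of what "FDR-corrected P < 0.1" involves, the sketch below applies the Benjamini-Hochberg step-up adjustment to a list of raw p-values. The abstract does not state which FDR procedure was used, so Benjamini-Hochberg is an assumption here, and the function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Assumption: the abstract only says "FDR corrected"; BH is one common
    choice and may not be the procedure the authors used.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A feature would then be called significant when its adjusted value falls below the chosen threshold, for example `[q < 0.1 for q in benjamini_hochberg(raw_p)]`.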

Conclusion

Our findings provide a deeper understanding of the metabolic consequences of malnutrition. Further research is required to determine if specific metabolites may guide improved management, and/or act as novel biomarkers for assessing response to treatment.

2.

Introduction

Botanicals containing iridoid and phenylethanoid/phenylpropanoid glycosides are used worldwide for the treatment of inflammatory musculoskeletal conditions that are primary causes of human years lived with disability, such as arthritis and lower back pain.

Objectives

We report the analysis of candidate anti-inflammatory metabolites of several endemic Scrophularia species and Verbascum thapsus used medicinally by peoples of North America.

Methods

Leaves, stems, and roots were analyzed by ultra-performance liquid chromatography-mass spectrometry (UPLC-MS) and partial least squares-discriminant analysis (PLS-DA) was performed in MetaboAnalyst 3.0 after processing the datasets in Progenesis QI.

Results

Comparison of the datasets revealed significant and differential accumulation of iridoid and phenylethanoid/phenylpropanoid glycosides in the tissues of the endemic Scrophularia species and Verbascum thapsus.

Conclusions

Our investigation identified several species of pharmacological interest as good sources for harpagoside and other important anti-inflammatory metabolites.

3.

Background

Human cancers are complex ecosystems composed of cells with distinct molecular signatures. Such intratumoral heterogeneity poses a major challenge to cancer diagnosis and treatment. Recent advances in single-cell techniques such as scRNA-seq have brought unprecedented insights into cellular heterogeneity. A resulting computational challenge is to cluster high-dimensional, noisy datasets that contain substantially fewer cells than genes.

Methods

In this paper, we introduce conCluster, a consensus clustering framework for cancer subtype identification from single-cell RNA-seq data. Using an ensemble strategy, conCluster fuses multiple basic partitions into consensus clusters.
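
The idea of fusing several basic partitions can be sketched with a co-association matrix: two cells end up together if most of the input partitions put them in the same cluster. This is a generic ensemble-clustering illustration, not conCluster's actual algorithm, which the abstract does not describe; the function names and the 0.5 threshold are assumptions.

```python
def co_association(partitions):
    """Fraction of partitions in which each pair of cells shares a cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]


def consensus_clusters(partitions, threshold=0.5):
    """Group cells linked by co-association above `threshold`.

    Hypothetical fusion rule: connected components of the thresholded
    co-association graph. Other rules (e.g. hierarchical clustering on
    the matrix) are equally plausible.
    """
    matrix = co_association(partitions)
    n = len(matrix)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels
```

Each input partition is a list of cluster labels, one per cell; label values need not agree across partitions, since only co-membership is used.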

Results

Applied to real cancer scRNA-seq datasets, conCluster detects cancer subtypes more accurately than widely used scRNA-seq clustering methods. Further, we conducted co-expression network analysis for the identified melanoma subtypes.

Conclusions

Our analysis demonstrates that these subtypes exhibit distinct gene co-expression networks and significant gene sets with different functional enrichment.

4.
Xu M, Zhu M, Zhang L. BMC Genomics, 2008, 9(Z2): S18

Background

Microarray technology is often used to identify the genes that are differentially expressed between two biological conditions. On the other hand, since microarray datasets contain a small number of samples and a large number of genes, it is usually desirable to identify small gene subsets with distinct patterns between sample classes. Such gene subsets are highly discriminative in phenotype classification because their features are tightly coupled. Unfortunately, classifiers identified in this way tend to generalize poorly to test samples because of overfitting.

Results

We propose a novel approach that combines supervised and unsupervised learning techniques to generate increasingly discriminative gene clusters in an iterative manner. Our experiments on both simulated and real datasets show that our method can produce a series of robust gene clusters with good classification performance compared with existing approaches.

Conclusion

This backward approach for refining a series of highly discriminative gene clusters for classification purposes proves to be very consistent and stable when applied to various types of training samples.

5.
Lyu Chuqiao, Wang Lei, Zhang Juhua. BMC Genomics, 2018, 19(10): 905-165

Background

DNase I hypersensitive sites (DHSs) are associated with cis-regulatory DNA elements. An efficient method of identifying DHSs can enhance the understanding of chromatin accessibility. Despite a multitude of resources available online, including experimental datasets and computational tools, the complex language of DHSs remains incompletely understood.

Methods

Here, we address this challenge using an approach based on a state-of-the-art machine learning method. We present a novel convolutional neural network (CNN) that combines Inception-like networks with a gating mechanism, so that it responds to multiple patterns and long-term associations in DNA sequences, to predict multi-scale DHSs in Arabidopsis, rice and Homo sapiens.

Results

Our method achieves an area under the curve (AUC) of 0.961 on Arabidopsis, 0.969 on rice and 0.918 on Homo sapiens.

Conclusions

Our method provides an efficient and accurate way to identify multi-scale DHS sequences by deep learning.

6.

Introduction

Experiments in metabolomics rely on the identification and quantification of metabolites in complex biological mixtures. This remains one of the major challenges in NMR/mass spectrometry analysis of metabolic profiles. Both are required if metabolomics is to establish itself as a general approach for testing a priori formulated hypotheses on the basis of exhaustive metabolome characterization, rather than as an exploratory tool dealing with unknown metabolic features.

Objectives

In this article we propose a method, named ASICS, based on a strong statistical theory, that automatically handles metabolite identification and quantification in proton NMR spectra.

Methods

A statistical linear model is built to explain a complex spectrum using a library containing pure metabolite spectra. This model can handle local or global chemical shift variations due to experimental conditions using a warping function. A statistical lasso-type estimator identifies and quantifies the metabolites in the complex spectrum. This estimator shows good statistical properties and handles peak overlapping issues.
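
The core idea, explaining a complex spectrum as a sparse non-negative combination of pure library spectra, can be sketched with a small coordinate-descent lasso. This is a simplified illustration under stated assumptions, not the ASICS estimator: it omits the warping function for chemical shift variations, and the function name `nonneg_lasso`, the penalty value and the library layout are all hypothetical.

```python
def dot(u, v):
    """Inner product of two equally long lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def nonneg_lasso(library, mixture, lam=0.01, sweeps=200):
    """Fit mixture ~ sum_k coef[k] * library[k] with coef >= 0.

    Minimises 0.5 * ||mixture - fit||^2 + lam * sum(coef) by cyclic
    coordinate descent. Assumption: all spectra share the same ppm grid.
    """
    names = list(library)
    coef = {name: 0.0 for name in names}
    for _ in range(sweeps):
        for name in names:
            # Residual of the mixture with every other metabolite removed.
            partial = list(mixture)
            for other in names:
                if other != name and coef[other] != 0.0:
                    partial = [p - coef[other] * s
                               for p, s in zip(partial, library[other])]
            rho = dot(library[name], partial)
            norm = dot(library[name], library[name])
            # Soft-threshold and clip at zero (non-negative quantities).
            coef[name] = max(0.0, (rho - lam) / norm)
    return coef
```

Metabolites whose coefficient is driven to zero are treated as absent, and the remaining coefficients serve as relative quantities; overlapping peaks are handled because all library spectra are fitted jointly.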

Results

The performance of the method was investigated on known mixtures (such as synthetic urine) and on plasma datasets from duck and human. The results show noteworthy performance, exceeding that of existing methods.

Conclusion

ASICS is a completely automated procedure to identify and quantify metabolites in 1H NMR spectra of biological mixtures. It will strengthen NMR-based metabolomics by helping experts obtain metabolic profiles quickly and accurately.

7.

Introduction

Mass spectrometry imaging (MSI) experiments result in complex multi-dimensional datasets, which require specialist data analysis tools.

Objectives

We have developed massPix—an R package for analysing and interpreting data from MSI of lipids in tissue.

Methods

massPix produces single ion images, performs multivariate statistics and provides putative lipid annotations based on accurate mass matching against generated lipid libraries.
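
Accurate mass matching can be illustrated in a few lines: an observed m/z is compared with each library entry and kept if the relative error is within a parts-per-million tolerance. The sketch is in Python rather than R and does not reproduce massPix's own code; the function name, the 5 ppm default and the library entries used in any call are placeholders.

```python
def match_by_mass(observed_mz, library, tol_ppm=5.0):
    """Return (name, error_ppm) pairs for library entries within tolerance.

    `library` maps a putative annotation to its theoretical m/z.
    Hits are sorted by absolute mass error, best match first.
    """
    hits = []
    for name, theoretical_mz in library.items():
        error_ppm = (observed_mz - theoretical_mz) / theoretical_mz * 1e6
        if abs(error_ppm) <= tol_ppm:
            hits.append((name, error_ppm))
    return sorted(hits, key=lambda hit: abs(hit[1]))
```

Because several lipids can share nearly the same mass, a match of this kind is only a putative annotation, which is why the result is a ranked list rather than a single name.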

Results

Classification of tissue regions with high spectral similarity can be carried out by principal components analysis (PCA) or k-means clustering.

Conclusion

massPix is an open-source tool for the analysis and statistical interpretation of MSI data, and is particularly useful for lipidomics applications.

8.

Introduction

Failure to properly account for normal systematic variations in OMICS datasets may result in misleading biological conclusions. Accordingly, normalization is a necessary step in the proper preprocessing of OMICS datasets. In this regard, an optimal normalization method will effectively reduce unwanted biases and increase the accuracy of downstream quantitative analyses. However, it is currently unclear which normalization method is best, since each algorithm addresses systematic noise in a different way.

Objective

Determine an optimal choice of a normalization method for the preprocessing of metabolomics datasets.

Methods

Nine MVAPACK normalization algorithms were compared using simulated and experimental NMR spectra modified with added Gaussian noise and random dilution factors. Methods were evaluated on their ability to recover the intensities of the true spectral peaks and to reproduce the true classifying features of an orthogonal projections to latent structures discriminant analysis (OPLS-DA) model.

Results

Most normalization methods (except histogram matching) performed equally well at modest levels of signal variance. Only probabilistic quotient (PQ) and constant sum (CS) maintained the highest level of peak recovery (> 67%) and correlation with true loadings (> 0.6) at maximal noise.
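
For readers unfamiliar with the two best-performing methods, the sketch below shows constant sum and probabilistic quotient normalization in their commonly described form. It is not the MVAPACK implementation; the function names are hypothetical, and using the median spectrum as the PQ reference is an assumption.

```python
from statistics import median


def constant_sum(spectrum, total=1.0):
    """Scale a spectrum so that its intensities add up to `total`."""
    s = sum(spectrum)
    return [x * total / s for x in spectrum]


def probabilistic_quotient(spectra):
    """Probabilistic quotient normalization of a list of spectra.

    Steps: constant-sum scaling, median reference spectrum, per-bin
    quotients against the reference, then division by the median quotient.
    """
    scaled = [constant_sum(s) for s in spectra]
    n_bins = len(scaled[0])
    reference = [median(s[k] for s in scaled) for k in range(n_bins)]
    normalized = []
    for s in scaled:
        quotients = [s[k] / reference[k]
                     for k in range(n_bins) if reference[k] > 0]
        factor = median(quotients)
        normalized.append([x / factor for x in s])
    return normalized
```

Both methods remove a global dilution factor; PQ relies on the median quotient, so a few strongly changing peaks influence the estimated factor less than they influence a plain sum.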

Conclusion

PQ and CS performed the best at recovering peak intensities and reproducing the true classifying features for an OPLS-DA model regardless of spectral noise level. Our findings suggest that performance is largely determined by the level of noise in the dataset, while the effect of dilution factors was negligible. A minimal allowable noise level of 20% was also identified for a valid NMR metabolomics dataset.

9.

Background

Biomedical extraction based on supervised machine learning still faces the problem that a limited labeled dataset does not saturate the learning method. Many supervised learning algorithms for bio-event extraction are affected by data sparseness.

Methods

In this study, a semi-supervised method that combines labeled data with large-scale unlabeled data is presented to improve the performance of biomedical event extraction. We propose a rich set of feature vectors, including a variety of syntactic and semantic features, such as N-gram features, walk subsequence features, and predicate argument structure (PAS) features, and in particular some new features derived from a strategy named Event Feature Coupling Generalization (EFCG). The EFCG algorithm creates useful event recognition features by exploiting the correlation between two sorts of original features extracted from the labeled data, with the correlation computed with the help of massive amounts of unlabeled data. The EFCG approach aims to solve the data sparseness problem caused by a limited annotated corpus, and enables the new features to cover much more event-related information with better generalization properties.

Results

The effectiveness of our event extraction system is evaluated on datasets from the BioNLP Shared Task 2011 and PubMed. Experimental results demonstrate state-of-the-art performance in the fine-grained biomedical information extraction task.

Conclusions

Limited labeled data can be combined with unlabeled data to tackle the data sparseness problem by means of our EFCG approach, and the classification capability of the model was enhanced by establishing a rich feature set from both labeled and unlabeled datasets. This semi-supervised learning approach could therefore go far towards improving the performance of the event extraction system. To the best of our knowledge, this is the first attempt at combining labeled and unlabeled data for tasks related to biomedical event extraction.

10.

Introduction

While the evolutionary adaptation of enzymes to their own substrates is a well-assessed and rationalized field, how molecules were originally selected to initiate and assemble convenient metabolic pathways is a fascinating but still debated question.

Objectives

The aim of the present study is to give a rationale for the preferential selection of specific molecules to generate metabolic pathways.

Methods

The comparison of structural features of molecules, through an inductive methodological approach, offers a key to cautiously proposing a determining factor for their metabolic recruitment.

Results

Starting with some commonplaces occurring in the structural representation of relevant carbohydrates, such as glucose, fructose and ribose, arguments are presented that associate stable structural determinants of these molecules with their peculiar occurrence in metabolic pathways.

Conclusions

Among other possible factors, the reliability of the structural asset of a molecule may be relevant for its selection among structurally and, a priori, functionally similar molecules.

11.

Background

The q-value is a widely used statistical method for estimating the false discovery rate (FDR), which is a conventional significance measure in the analysis of genome-wide expression data. The q-value is a random variable and it may underestimate the FDR in practice. An underestimated FDR can lead to unexpected false discoveries in follow-up validation experiments. This issue has not been well addressed in the literature, especially when a permutation procedure is necessary for p-value calculation.

Results

We proposed a statistical method for the conservative adjustment of the q-value. In practice, it is usually necessary to calculate p-values by a permutation procedure. This was also considered in our adjustment method. We used simulated data as well as experimental microarray or sequencing data to illustrate the usefulness of our method.
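
To make the permutation step concrete, the sketch below computes an exact two-sided permutation p-value for the difference in group means by enumerating every relabelling of the pooled samples. It illustrates only the p-value calculation, not the conservative q-value adjustment proposed in the study; the function name and the choice of test statistic are assumptions.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means.

    Enumerates every way of assigning len(group_a) pooled values to
    group A, so it is only practical for small sample sizes.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    def abs_mean_diff(indices_a):
        a = [pooled[i] for i in indices_a]
        b = [pooled[i] for i in range(len(pooled)) if i not in indices_a]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = abs_mean_diff(tuple(range(n_a)))
    extreme = 0
    total = 0
    for indices in combinations(range(len(pooled)), n_a):
        total += 1
        # Small tolerance so ties are not lost to floating-point rounding.
        if abs_mean_diff(indices) >= observed - 1e-12:
            extreme += 1
    return extreme / total
```

With small groups the p-value can only take a few discrete values (three versus three samples allow no value below 0.1), which is one reason permutation-based p-values complicate FDR estimation.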

Conclusions

The conservativeness of our approach has been mathematically confirmed in this study. We have demonstrated the importance of conservative adjustment of the q-value, particularly when the proportion of differentially expressed genes is small or the overall differential expression signal is weak.

12.

Background

CASKIN2 is a homolog of CASKIN1, a scaffolding protein that participates in a signaling network with CASK (calcium/calmodulin-dependent serine kinase). Despite a high level of homology between CASKIN2 and CASKIN1, CASKIN2 cannot bind CASK due to the absence of a CASK Interaction Domain and, consequently, may have evolved undiscovered structural and functional distinctions.

Results

We demonstrate that the crystal structure of the Sterile Alpha Motif (SAM) domain tandem (SAM1-SAM2) oligomer from CASKIN2 differs from that of CASKIN1, with the minimal repeating unit being a dimer rather than a monomer. Analytical ultracentrifugation sedimentation velocity methods revealed differences in monomer/dimer equilibria across a range of concentrations and ionic strengths for the wild type CASKIN2 SAM tandem and a structure-directed double mutant that could not oligomerize. Further distinguishing CASKIN2 from CASKIN1, EGFP-tagged SAM tandem proteins expressed in Neuro2a cells produced punctae that were distinct in both shape and size.

Conclusions

This study illustrates a new way in which neuronal SAM domains can assemble into large macromolecular assemblies that might concentrate and amplify synaptic responses.

13.

Introduction

Data processing is one of the biggest problems in metabolomics, given the high number of samples analyzed and the need for multiple software packages for each step of the processing workflow.

Objectives

To merge the steps required for metabolomics data processing into a single platform.

Methods

KniMet is a workflow for the processing of mass spectrometry-metabolomics data based on the KNIME Analytics platform.

Results

The approach includes key steps to follow in metabolomics data processing: feature filtering, missing value imputation, normalization, batch correction and annotation.

Conclusion

KniMet provides the user with a local, modular and customizable workflow for the processing of both GC–MS and LC–MS open profiling data.

14.

Introduction

It is difficult to elucidate the metabolic and regulatory factors causing lipidome perturbations.

Objectives

This work simplifies this process.

Methods

A method has been developed to query an online holistic lipid metabolic network (of 7923 metabolites) to extract the pathways that connect the input list of lipids.

Results

The output enables pathway visualisation and the querying of other databases to identify potential regulators. When used to study a plasma lipidome dataset of polycystic ovary syndrome, 14 enzymes were identified, of which 3 are linked to ELAVL1, an mRNA stabiliser.

Conclusion

This method provides a simplified approach to identifying potential regulators causing lipid-profile perturbations.

15.

Introduction

Untargeted metabolomics is a powerful tool for biological discoveries. To analyze the complex raw data, significant advances in computational approaches have been made, yet it is not clear how exhaustive and reliable the data analysis results are.

Objectives

Assessment of the quality of raw data processing in untargeted metabolomics.

Methods

Five published untargeted metabolomics studies were reanalyzed.

Results

Omissions of at least 50 relevant compounds from the original results as well as examples of representative mistakes were reported for each study.

Conclusion

Incomplete raw data processing shows unexplored potential of current and legacy data.

16.

Introduction

Adoption of automatic profiling tools for 1H-NMR-based metabolomic studies still lags behind other approaches in the absence of the flexibility and interactivity necessary to adapt to the properties of study data sets of complex matrices.

Objectives

To provide an open source tool that fully integrates these needs and enables the reproducibility of the profiling process.

Methods

rDolphin incorporates novel techniques to optimize exploratory analysis, metabolite identification, and validation of profiling output quality.

Results

The information and quality achieved in two public datasets of complex matrices are maximized.

Conclusion

rDolphin is an open-source R package (http://github.com/danielcanueto/rDolphin) able to provide the best balance between accuracy, reproducibility and ease of use.

17.

Background

Measurement-unit conflicts are a perennial problem in integrative research domains such as clinical meta-analysis. As multi-national collaborations grow, as new measurement instruments appear, and as Linked Open Data infrastructures become increasingly pervasive, the number of such conflicts will similarly increase.

Methods

We propose a generic approach to the problem of (a) encoding measurement units in datasets in a machine-readable manner, (b) detecting when a dataset contains mixtures of measurement units, and (c) automatically converting any conflicting units into a desired unit, as defined for a given study.
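
Steps (a) to (c) can be illustrated with a toy example in which each measurement carries its unit, mixtures are detected, and values are converted to a study-defined target. The study itself relies on ontologies and Semantic Web Services; the plain dictionary of conversion factors and the function names below are stand-ins chosen only for illustration.

```python
# (a) Units are encoded alongside each value as (value, unit) pairs.
# Conversion factors to mmHg; 1 cmHg equals 10 mmHg.
TO_MMHG = {"mmHg": 1.0, "cmHg": 10.0}


def detect_units(records):
    """(b) Return the set of units present in a list of (value, unit)."""
    return {unit for _, unit in records}


def harmonize(records, target="mmHg"):
    """(c) Convert every record to the target unit."""
    target_factor = TO_MMHG[target]
    return [(value * TO_MMHG[unit] / target_factor, target)
            for value, unit in records]
```

A dataset would be flagged as conflicting when `len(detect_units(records)) > 1`, after which `harmonize` brings all values to the unit required by the study's phenotype definitions.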

Results

We utilized existing ontologies and standards for scientific data representation, measurement unit definition, and data manipulation to build a simple and flexible Semantic Web Service-based approach to measurement-unit harmonization. A cardiovascular patient cohort in which clinical measurements were recorded in a number of different units (e.g., mmHg and cmHg for blood pressure) was automatically classified into a number of clinical phenotypes, semantically defined using different measurement units.

Conclusions

We demonstrate that through a combination of semantic standards and frameworks, unit integration problems can be automatically detected and resolved.

18.

Background

Cancer associated fibroblasts are activated in the tumor microenvironment and contribute to tumor progression, angiogenesis, extracellular matrix remodeling, and inflammation.

Methods

To identify proteins characteristic of fibroblasts in colorectal cancer, we used liquid chromatography-tandem mass spectrometry to derive protein abundance from whole-tissue homogenates of human colorectal cancer/normal mucosa pairs. Alterations of protein levels were determined by two-sided t test with greater than threefold difference and an FDR of < 0.05. Publicly available datasets were used to predict proteins of stromal origin and link protein with mRNA regulation. Immunohistochemistry confirmed the localization of selected proteins.

Results

We identified a set of 24 proteins associated with inflammation, matrix organization, TGFβ receptor signaling and angiogenesis, mainly originating from the stroma. Most prominent were the increased abundance of SerpinB5 in the parenchyma and of latent transforming growth factor β-binding protein, thrombospondin-B2, and secreted protein acidic-and-cysteine-rich in the stroma. Extracellular matrix remodeling involved collagens type VIII, XII, XIV, and VI as well as lysyl-oxidase-2. In silico analysis of mRNA levels demonstrated altered expression in the tumor and the adjacent normal tissue as compared to mucosa of healthy individuals, indicating that inflammatory activation affected the surrounding tissue. Immunohistochemistry of 26 tumor specimens confirmed upregulation of SerpinB5, thrombospondin B2 and secreted protein acidic-and-cysteine-rich.

Conclusions

This study demonstrates the feasibility of detecting tumor- and compartment-specific protein-signatures that are functionally meaningful by proteomic profiling of whole-tissue extracts together with mining of RNA expression datasets. The results provide the basis for further exploration of inflammation-related stromal markers in larger patient cohorts and experimental models.

19.

Background

Taxonomic profiling of microbial communities is often performed using small subunit ribosomal RNA (SSU) amplicon sequencing (16S or 18S), while environmental shotgun sequencing is often focused on functional analysis. Large shotgun datasets contain a significant number of SSU sequences and these can be exploited to perform an unbiased SSU-based taxonomic analysis.

Results

Here we present a new program called RiboTagger that identifies and extracts taxonomically informative ribotags located in a specified variable region of the SSU gene in a high-throughput fashion.

Conclusions

RiboTagger permits fast recovery of SSU-RNA sequences from shotgun nucleic acid surveys of complex microbial communities. The program targets all three domains of life, exhibits high sensitivity and specificity and is substantially faster than comparable programs.

20.

Introduction

The generic metabolomics data processing workflow is constructed with a serial set of processes including peak picking, quality assurance, normalisation, missing value imputation, transformation and scaling. The combination of these processes should present the experimental data in an appropriate structure so as to identify the biological changes in a valid and robust manner.

Objectives

Currently, different researchers apply different data processing methods and no assessment of the permutations applied to UHPLC-MS datasets has been published. Here we wish to define the most appropriate data processing workflow.

Methods

We assess the influence of normalisation, missing value imputation, transformation and scaling methods on univariate and multivariate analysis of UHPLC-MS datasets acquired for different mammalian samples.

Results

Our studies have shown that once data are filtered, missing values are not correlated with m/z, retention time or response. Following an exhaustive evaluation, we recommend PQN normalisation with no missing value imputation and no transformation or scaling for univariate analysis. For PCA we recommend applying PQN normalisation with Random Forest missing value imputation, glog transformation and no scaling method. For PLS-DA we recommend PQN normalisation, KNN as the missing value imputation method, generalised logarithm transformation and no scaling. These recommendations are based on searching for the biologically important metabolite features independent of their measured abundance.
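
The generalised logarithm (glog) recommended above for PCA and PLS-DA behaves like a logarithm for large intensities while remaining defined at zero. Several parameterisations exist; the sketch uses one common form, and the default value of the offset `lam` is an arbitrary assumption (in practice it is usually estimated from the data, for example from quality-control samples).

```python
import math


def glog(x, lam=1.0):
    """Generalised logarithm: ln(x + sqrt(x^2 + lam)).

    Assumption: this is one common form of the transform; the study does
    not state its exact parameterisation or how `lam` was chosen.
    """
    return math.log(x + math.sqrt(x * x + lam))
```

For intensities much larger than the square root of `lam` the result approaches `ln(2x)`, so high-abundance features are compressed as with an ordinary log, while low-abundance features are not inflated towards minus infinity.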

Conclusion

The appropriate choice of normalisation, missing value imputation, transformation and scaling methods differs depending on the data analysis method and the choice of method is essential to maximise the biological derivations from UHPLC-MS datasets.
