Similar literature
 Found 20 similar articles; search time 31 ms
1.
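Li Gaolei, Huang Wei, Sun Hao, Li Yudong. Acta Microbiologica Sinica (《微生物学报》), 2021, 61(9): 2581-2593
With the arrival of the big data era, turning massive omics datasets into understandable, visualizable knowledge has become one of the major challenges facing bioinformatics. To handle complex, high-dimensional microbiome data, machine learning algorithms are now being applied to human microbiome research to uncover the complex mechanisms behind disease. This article first briefly describes methods for processing microbiome data and commonly used machine learning algorithms, such as support vector machines (SVM), random forests (RF), and artificial neural networks (ANN). It then explains the machine learning workflow and its key points, and discusses applications of machine learning to predicting host phenotypes from microbiome data. Finally, using the prediction of oral malodor from salivary microbiome data as an example, it builds and evaluates machine learning models and provides R/Python code for use in microbiome research practice (https://github.com/LiLabZSU/microbioML).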

2.
Scientific research is shedding light on the interaction of the gut microbiome with the human host and on its role in human health. Existing machine learning methods have shown great potential in discriminating healthy from diseased microbiome states. Most of them leverage shotgun metagenomic sequencing to extract gut microbial species-relative abundances or strain-level markers. Each of these gut microbial profiling modalities showed diagnostic potential when tested separately; however, no existing approach combines them in a single predictive framework. Here, we propose the Multimodal Variational Information Bottleneck (MVIB), a novel deep learning model capable of learning a joint representation of multiple heterogeneous data modalities. MVIB achieves competitive classification performance while being faster than existing methods. Additionally, MVIB offers interpretable results. Our model adopts an information theoretic interpretation of deep neural networks and computes a joint stochastic encoding of different input data modalities. We use MVIB to predict whether human hosts are affected by a certain disease by jointly analysing gut microbial species-relative abundances and strain-level markers. MVIB is evaluated on human gut metagenomic samples from 11 publicly available disease cohorts covering 6 different diseases. We achieve high performance (0.80 < ROC AUC < 0.95) on 5 cohorts and at least medium performance on the remaining ones. We adopt a saliency technique to interpret the output of MVIB and identify the most relevant microbial species and strain-level markers to the model’s predictions. We also perform cross-study generalisation experiments, where we train and test MVIB on different cohorts of the same disease, and overall we achieve comparable results to the baseline approach, i.e. the Random Forest. Further, we evaluate our model by adding metabolomic data derived from mass spectrometry as a third input modality. 
Our method is scalable with respect to input data modalities and has an average training time of < 1.4 seconds. The source code and the datasets used in this work are publicly available.

3.
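Machine learning algorithms and their applications in environmental microbiology research   Total citations: 1, self-citations: 0, citations by others: 1
Chen He, Tao Ye, Mao Zhendu, Xing Peng. Acta Microbiologica Sinica (《微生物学报》), 2022, 62(12): 4646-4662
Microorganisms are ubiquitous in the environment. They are key participants in biogeochemical cycles and environmental evolution, and they play important roles in environmental monitoring, ecological remediation, and conservation. With the development of high-throughput technologies, large volumes of microbial data are being generated. Applying machine learning to model and analyze environmental microbial big data is of great significance for both scientific research and societal applications in areas such as microbial biomarker identification, pollutant prediction, and environmental quality prediction. Machine learning can be divided into two broad categories: supervised and unsupervised learning. In microbiome research, unsupervised learning efficiently learns the features of the input data through methods such as clustering and dimensionality reduction, and thereby integrates and groups microbial data. Supervised learning trains a model on a microbial dataset with both features and labels, so that it can infer labels for data that have features but no labels, enabling classification, identification, and prediction for new data. However, complex machine learning algorithms usually prioritize predictive accuracy at the expense of interpretability. Machine learning models can often be regarded as "black boxes" that predict a particular outcome, meaning that little is known about how the model arrives at its predictions. To apply machine learning more widely in microbiome research and improve our ability to extract valuable microbial information, it is especially important to understand machine learning algorithms in depth and to improve model interpretability. This article introduces the machine learning algorithms commonly used in environmental microbiology and the steps for building machine learning models from microbiome data, including feature selection, algorithm selection, model construction, and evaluation. It also reviews applications of various machine learning models in environmental microbiology, explores the associations between microbiomes and their surrounding environments, discusses ways to improve model interpretability, and provides a scientific reference for future environmental monitoring and environmental health prediction.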

4.
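Current status of research on and use of the core microbiome   Total citations: 1, self-citations: 0, citations by others: 1
With the rapid development of molecular biology and bioinformatics, next-generation sequencing can readily detect complex microbial taxa in different samples. Faced with the analytical challenges posed by these large and complex microbiome datasets, using core microbiome approaches to describe and analyze the core microbiome and keystone species in samples has become a research focus in recent years. The results of such analyses will reveal microbial taxa closely tied to host health, growth, and production, and will help deepen our understanding of the relationships between microbes and their hosts, of the effects microbes have on hosts, and of the functions of microbiomes in natural ecosystems. This article describes the current state of research and use regarding the definition of the core microbiome, research methods, and its relationships with animals and plants, offering ideas for making better use of the core microbiome to address problems in the environment, human health, and agricultural production.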

5.
IRBM, 2022, 43(5): 333-339
1) Objectives: Preterm birth caused by preterm labor is one of the major health problems in the world. In this article, we present a new framework for dealing with this problem through the processing of electrohysterographic (EHG) signals that are recorded during labor and pregnancy. The objective of this research is to improve the classification between labor and pregnancy contractions by using a new approach that focuses on connectivity analysis based on graph parameters, representative of uterine synchronization, and by comparing neural network and machine learning methods for classifying between labor and pregnancy. 2) Material and methods: After denoising the 16 EHG signals recorded from the abdomens of pregnant women, we applied different connectivity methods to obtain connectivity matrices; then, using graph theory, we extracted graph parameters from the connectivity matrices; finally, we tested different neural network and machine learning methods on the features obtained from both graph and connectivity methods in order to classify between labor and pregnancy. 3) Results: The best results were obtained with the logistic regression method. We also demonstrate the power of graph parameters extracted from the connectivity matrices to improve the classification results. 4) Conclusion: The use of graph analysis associated with machine learning methods can be a powerful tool to improve labor and pregnancy classification based on the analysis of EHG signals.

6.
Emerging evidence suggests that host-microbe interaction in the cervicovaginal microenvironment contributes to cervical carcinogenesis, yet dissecting these complex interactions is challenging. Herein, we performed an integrated analysis of multiple “omics” datasets to develop predictive models of the cervicovaginal microenvironment and identify characteristic features of vaginal microbiome, genital inflammation and disease status. Microbiomes, vaginal pH, immunoproteomes and metabolomes were measured in cervicovaginal specimens collected from a cohort (n = 72) of Arizonan women with or without cervical neoplasm. Multi-omics integration methods, including neural networks (mmvec) and Random Forest supervised learning, were utilized to explore potential interactions and develop predictive models. Our integrated analyses revealed that immune and cancer biomarker concentrations were reliably predicted by Random Forest regressors trained on microbial and metabolic features, suggesting close correspondence between the vaginal microbiome, metabolome, and genital inflammation involved in cervical carcinogenesis. Furthermore, we show that features of the microbiome and host microenvironment, including metabolites, microbial taxa, and immune biomarkers are predictive of genital inflammation status, but only weakly to moderately predictive of cervical neoplastic disease status. Different feature classes were important for prediction of different phenotypes. Lipids (e.g. sphingolipids and long-chain unsaturated fatty acids) were strong predictors of genital inflammation, whereas predictions of vaginal microbiota and vaginal pH relied mostly on alterations in amino acid metabolism. Finally, we identified key immune biomarkers associated with the vaginal microbiota composition and vaginal pH (MIF), as well as genital inflammation (IL-6, IL-10, MIP-1α).

7.
Anthropogenic environments such as those created by intensive farming of livestock, have been proposed to provide ideal selection pressure for the emergence of antimicrobial-resistant Escherichia coli bacteria and antimicrobial resistance genes (ARGs) and spread to humans. Here, we performed a longitudinal study in a large-scale commercial poultry farm in China, collecting E. coli isolates from both farm and slaughterhouse; targeting animals, carcasses, workers and their households and environment. By using whole-genome phylogenetic analysis and network analysis based on single nucleotide polymorphisms (SNPs), we found highly interrelated non-pathogenic and pathogenic E. coli strains with phylogenetic intermixing, and a high prevalence of shared multidrug resistance profiles amongst livestock, human and environment. Through an original data processing pipeline which combines omics, machine learning, gene sharing network and mobile genetic elements analysis, we investigated the resistance to 26 different antimicrobials and identified 361 genes associated to antimicrobial resistance (AMR) phenotypes; 58 of these were known AMR-associated genes and 35 were associated to multidrug resistance. We uncovered an extensive network of genes, correlated to AMR phenotypes, shared among livestock, humans, farm and slaughterhouse environments. We also found several human, livestock and environmental isolates sharing closely related mobile genetic elements carrying ARGs across host species and environments. In a scenario where no consensus exists on how antibiotic use in the livestock may affect antibiotic resistance in the human population, our findings provide novel insights into the broader epidemiology of antimicrobial resistance in livestock farming. 
Moreover, our original data analysis method has the potential to uncover AMR transmission pathways when applied to the study of other pathogens active in other anthropogenic environments characterised by complex interconnections between host species.

8.
Many microbes are important symbionts of humans. They form specific microbial communities, participate in many of their host's biological processes, and thus profoundly affect human health. Metagenomic sequencing has been widely used in human microbiota studies because it can examine all the genetic material in an environment as a whole, without the need to isolate or cultivate microorganisms. Researchers in this area have made many efforts to extract knowledge from a variety of metagenomic data. In this review, we survey some prominent studies in metagenomics. We group them into three categories: construction of taxonomic and gene references, characterization of microbiome distribution patterns, and detection of microbiome alterations associated with specific human phenotypes or diseases. Some available data resources are also listed. This review can serve as an entry point to this exciting and rapidly developing field for researchers interested in human microbiomes.

9.
Under the network environment, the trading volume and asset price of a financial commodity or instrument are affected by various complicated factors. Machine learning and sentiment analysis provide powerful tools to collect a great deal of data from the website and retrieve useful information for effectively forecasting financial risk of associated companies. This article studies trading volume and asset price risk when sentimental financial information data are available using both sentiment analysis and popular machine learning approaches: artificial neural network (ANN) and support vector machine (SVM). Nonlinear GARCH-based mining models are developed by integrating GARCH (generalized autoregressive conditional heteroskedasticity) theory and ANN and SVM. Empirical studies in the U.S. stock market show that the proposed approach achieves favorable forecast performances. GARCH-based SVM outperforms GARCH-based ANN for volatility forecast, whereas GARCH-based ANN achieves a better forecast result for the volatility trend. Results also indicate a strong correlation between information sentiment and both trading volume and asset price volatility.
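As background for the GARCH component mentioned in this abstract, the following is a minimal sketch of the standard GARCH(1,1) conditional variance recursion, sigma2[t] = omega + alpha * r[t-1]^2 + beta * sigma2[t-1]. It does not reproduce the paper's nonlinear GARCH-based ANN or SVM models; the function name, the parameter values, and the choice of starting variance are assumptions made for illustration.

```python
def garch11_variance(returns, omega, alpha, beta):
    """Conditional variance series of a plain GARCH(1,1) model.

    sigma2[t] = omega + alpha * returns[t-1]**2 + beta * sigma2[t-1]
    """
    if not returns:
        return []
    # Assumption: initialise with the unconditional variance when the
    # process is stationary (alpha + beta < 1); otherwise fall back to
    # the sample mean of squared returns.
    if alpha + beta < 1:
        sigma2 = [omega / (1.0 - alpha - beta)]
    else:
        sigma2 = [sum(r * r for r in returns) / len(returns)]
    for t in range(1, len(returns)):
        sigma2.append(omega + alpha * returns[t - 1] ** 2 + beta * sigma2[t - 1])
    return sigma2


if __name__ == "__main__":
    # Hypothetical return series and parameters, for illustration only.
    print(garch11_variance([0.0, 1.0, 0.0], omega=0.1, alpha=0.1, beta=0.8))
```

In the recursion, a large squared return at time t-1 raises the variance forecast for time t, which is the volatility-clustering behaviour that GARCH-type models are meant to capture.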

10.
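A review of machine learning-based modeling and analysis of gut microbiota data   Total citations: 1, self-citations: 0, citations by others: 1
The human gut microbiota is closely linked to human health and disease. Modeling and analyzing metagenomic data from the gut microbiota is of great significance for scientific research and societal applications in disease prediction and diagnosis. From the perspective of big data analysis and machine learning, this article reviews the principles and procedures of algorithms for modeling, analyzing, and making predictions from human gut microbiota data, together with representative research applications. It aims to advance research on gut microbiota analysis, explore effective ways of combining machine learning algorithms with such analysis, offer a reference for developing new diagnostic and therapeutic approaches based on gut microbiota data, and promote the development of precision medicine in China.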

11.
To predict rice blast, many machine learning methods have been proposed. As the quality and quantity of input data are essential for machine learning techniques, this study develops three artificial neural network (ANN)-based rice blast prediction models by combining two ANN models, the feed-forward neural network (FFNN) and long short-term memory (LSTM), with diverse input datasets, and compares their performance. The Blast_Weather_FFNN model had the highest recall score (66.3%) for rice blast prediction. This model requires two types of input data: blast occurrence data for the last 3 years and weather data (daily maximum temperature, relative humidity, and precipitation) between January and July of the prediction year. This study showed that the performance of an ANN-based disease prediction model was improved by applying suitable machine learning techniques together with the optimization of hyperparameter tuning involving input data. Moreover, we highlight the importance of the systematic collection of long-term disease data.
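The recall score reported in this abstract is the fraction of true disease occurrences that the model flags, TP / (TP + FN). A minimal sketch of that calculation follows; the function name and the toy labels are illustrative assumptions, not data from the study.

```python
def recall(y_true, y_pred, positive=1):
    """Recall = true positives / (true positives + false negatives)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("recall is undefined without positive examples")
    return tp / (tp + fn)


if __name__ == "__main__":
    # Hypothetical labels: 1 = blast occurred, 0 = no blast.
    observed = [1, 1, 1, 0, 0]
    predicted = [1, 1, 0, 0, 1]
    print(recall(observed, predicted))
```

Recall ignores false alarms, so it is usually read alongside precision; for disease forecasting, a missed outbreak is often the costlier error, which may be why recall is the headline metric here.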

12.
Artificial neural networks, taking inspiration from biological neurons, have become an invaluable tool for machine learning applications. Recent studies have developed techniques to effectively tune the connectivity of sparsely-connected artificial neural networks, which have the potential to be more computationally efficient than their fully-connected counterparts and more closely resemble the architectures of biological systems. We here present a normalisation, based on the biophysical behaviour of neuronal dendrites receiving distributed synaptic inputs, that divides the weight of an artificial neuron’s afferent contacts by their number. We apply this dendritic normalisation to various sparsely-connected feedforward network architectures, as well as simple recurrent and self-organised networks with spatially extended units. The learning performance is significantly increased, providing an improvement over other widely-used normalisations in sparse networks. The results are two-fold, being both a practical advance in machine learning and an insight into how the structure of neuronal dendritic arbours may contribute to computation.
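The normalisation described in this abstract, dividing each neuron's afferent weights by the number of afferent contacts, can be sketched as below. This is only a reading of the one-sentence description: the representation of a sparse layer as a list of rows with zeros for absent connections, and the function name, are assumptions, and the paper's full formulation may include further terms.

```python
def dendritic_normalise(weights):
    """Divide each neuron's incoming weights by its number of contacts.

    weights: list of rows, one row per neuron; an entry of 0 is taken to
    mean that the connection is absent (an assumption of this sketch).
    """
    normalised = []
    for row in weights:
        n_contacts = sum(1 for w in row if w != 0)
        if n_contacts == 0:
            # A neuron with no afferent contacts is left unchanged.
            normalised.append(list(row))
        else:
            normalised.append([w / n_contacts for w in row])
    return normalised


if __name__ == "__main__":
    # Hypothetical sparse weight matrix for three neurons.
    layer = [[2.0, 0.0, 4.0], [0.0, 0.0, 0.0], [3.0, 0.0, 0.0]]
    print(dendritic_normalise(layer))
```

Under this scheme a neuron with many inputs receives a proportionally smaller contribution from each one, so the total drive stays comparable across neurons with different connectivity.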

13.
Journal of Genetics and Genomics (《遗传学报》), 2021, 48(11): 972-983
Understanding the micro-coevolution of the human gut microbiome with host genetics is challenging but essential in both evolutionary and medical studies. To gain insight into the interactions between host genetic variation and the gut microbiome, we analyzed both the human genome and gut microbiome collected from a cohort of 190 students in the same boarding college and representing 3 ethnic groups, Uyghur, Kazakh, and Han Chinese. We found that differences in gut microbiome were greater between genetically distinct ethnic groups than between genetically closely related ones in taxonomic composition, functional composition, enterotype stratification, and microbiome genetic differentiation. We also observed considerable correlations between host genetic variants and the abundance of a subset of gut microbial species. Notably, interactions between gut microbiome species and host genetic variants might have coordinated effects on specific human phenotypes. Bacteroides ovatus, previously reported to modulate intestinal immunity, is significantly correlated with the host genetic variant rs12899811 (meta-P = 5.55 × 10⁻⁵), which regulates the VPS33B expression in the colon, acting as a tumor suppressor of colorectal cancer. These results advance our understanding of the micro-coevolution of the human gut microbiome and their interactive effects with host genetic variation on phenotypic diversity.

14.
Host-symbiont dynamics are known to influence host phenotype, but their role in social behavior has yet to be investigated. Variation in life history across honey bee (Apis mellifera) castes may influence community composition of gut symbionts, which may in turn influence caste phenotypes. We investigated the relationship between host-symbiont dynamics and social behavior by characterizing the hindgut microbiome among distinct honey bee castes: queens, males and two types of workers, nurses and foragers. Despite a shared hive environment and mouth-to-mouth food transfer among nestmates, we detected separation among gut microbiomes of queens, workers, and males. Gut microbiomes of nurses and foragers were similar to previously characterized honey bee worker microbiomes and to each other, despite differences in diet, activity, and exposure to the external environment. Queen microbiomes were enriched for bacteria that may enhance metabolic conversion of energy from food to egg production. We propose that the two types of workers, which have the highest diversity of operational taxonomic units (OTUs) of bacteria, are central to the maintenance of the colony microbiome. Foragers may introduce new strains of bacteria to the colony from the environment and transfer them to nurses, who filter and distribute them to the rest of the colony. Our results support the idea that host-symbiont dynamics influence microbiome composition and, reciprocally, host social behavior.

15.
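Microorganisms play important roles in ecosystems. Recent studies indicate that microbial communities contain a core set of taxa that strongly influence host health, growth, and production. Using two approaches, MetaCoMET and co-occurrence networks, we analyzed the core fungal microbiome of bark fungal communities from medicinal Eucommia ulmoides sampled in Hunan, Sichuan, and Guizhou. The MetaCoMET results showed that, at the OTU level, the core fungal microbiome comprised 16 taxa. The dominant member was an unidentified fungus in the family Nectriaceae, followed by Fusarium pseudensiforme, a member of Cephalothecaceae (Cephalothecaceae sp.), and a Fusarium species (Fusarium sp.), among others. Co-occurrence network analysis revealed 11 hub fungal taxa. Although the results of the two methods did not coincide completely, they agreed well on the 11 hub fungi. Overall, the specific core fungal microbiome showed a degree of stability. These results provide support for further uncovering the functions of the plant microbiome.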
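One common way to make the notion of a core set of taxa concrete is a prevalence threshold: a taxon counts as core if it is detected in at least a given fraction of samples. The sketch below illustrates only that simple definition. It is not claimed to be the procedure this study ran in MetaCoMET, and the function name, thresholds, and OTU table are hypothetical.

```python
def core_taxa(abundance, min_prevalence=0.8, detection=0.0):
    """Return taxa detected in at least `min_prevalence` of the samples.

    abundance: dict mapping taxon name -> list of abundances, one value
    per sample. A taxon is "detected" in a sample when its abundance
    exceeds `detection`.
    """
    core = []
    for taxon, values in abundance.items():
        if not values:
            continue
        prevalence = sum(1 for v in values if v > detection) / len(values)
        if prevalence >= min_prevalence:
            core.append(taxon)
    return sorted(core)


if __name__ == "__main__":
    # Hypothetical OTU table with four samples.
    table = {
        "OTU1": [5, 3, 2, 1],
        "OTU2": [0, 0, 4, 1],
        "OTU3": [1, 0, 1, 1],
    }
    print(core_taxa(table, min_prevalence=0.75))
```

The choice of prevalence and detection thresholds changes the resulting core set considerably, which is one reason different core-microbiome methods, such as the two compared in this abstract, need not agree exactly.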

16.
Development introduces structured correlations among traits that may constrain or bias the distribution of phenotypes produced. Moreover, when suitable heritable variation exists, natural selection may alter such constraints and correlations, affecting the phenotypic variation available to subsequent selection. However, exactly how the distribution of phenotypes produced by complex developmental systems can be shaped by past selective environments is poorly understood. Here we investigate the evolution of a network of recurrent nonlinear ontogenetic interactions, such as a gene regulation network, in various selective scenarios. We find that evolved networks of this type can exhibit several phenomena that are familiar in cognitive learning systems. These include formation of a distributed associative memory that can “store” and “recall” multiple phenotypes that have been selected in the past, recreate complete adult phenotypic patterns accurately from partial or corrupted embryonic phenotypes, and “generalize” (by exploiting evolved developmental modules) to produce new combinations of phenotypic features. We show that these surprising behaviors follow from an equivalence between the action of natural selection on phenotypic correlations and associative learning, well‐understood in the context of neural networks. This helps to explain how development facilitates the evolution of high‐fitness phenotypes and how this ability changes over evolutionary time.
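The associative learning that this abstract refers to can be illustrated with a small Hopfield-style network: Hebbian weights store a pattern, and iterating the network restores it from a corrupted start. This is a textbook neural-network sketch offered as an analogy, not the authors' evolutionary model of gene regulation; all names and the example pattern are assumptions.

```python
def train_hebbian(patterns):
    """Hebbian weights for +1/-1 patterns, with no self-connections."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j] / len(patterns)
    return weights


def recall_pattern(weights, state, max_steps=10):
    """Synchronously update the state until it stops changing."""
    state = list(state)
    for _ in range(max_steps):
        new_state = []
        for i, row in enumerate(weights):
            field = sum(w * s for w, s in zip(row, state))
            if field > 0:
                new_state.append(1)
            elif field < 0:
                new_state.append(-1)
            else:
                new_state.append(state[i])  # keep the unit on a tie
        if new_state == state:
            break
        state = new_state
    return state


if __name__ == "__main__":
    stored = [1, -1, 1, -1]      # hypothetical "adult phenotype"
    corrupted = [1, 1, 1, -1]    # one trait flipped
    w = train_hebbian([stored])
    print(recall_pattern(w, corrupted))
```

Here the stored pattern plays the role of a previously selected phenotype and the corrupted input plays the role of a perturbed embryonic state; the correlations encoded in the weights pull the state back to the stored pattern.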

17.
To date, most insights into the processes shaping vertebrate gut microbiomes have emerged from studies with cross‐sectional designs. While this approach has been valuable, emerging time series analyses on vertebrate gut microbiomes show that gut microbial composition can change rapidly from 1 day to the next, with consequences for host physical functioning, health, and fitness. Hence, the next frontier of microbiome research will require longitudinal perspectives. Here we argue that primatologists, with their traditional focus on tracking the lives of individual animals and familiarity with longitudinal fecal sampling, are well positioned to conduct research at the forefront of gut microbiome dynamics. We begin by reviewing some of the most important ecological processes governing microbiome change over time, and briefly summarizing statistical challenges and approaches to microbiome time series analysis. We then introduce five questions of general interest to microbiome science where we think field‐based primate studies are especially well positioned to fill major gaps: (a) Do early life events shape gut microbiome composition in adulthood? (b) Do shifting social landscapes cause gut microbial change? (c) Are gut microbiome phenotypes heritable across variable environments? (d) Does the gut microbiome show signs of host aging? And (e) do gut microbiome composition and dynamics predict host health and fitness? For all of these questions, we highlight areas where primatologists are uniquely positioned to make substantial contributions. We review preliminary evidence, discuss possible study designs, and suggest future directions.

18.
Advances in high-throughput sequencing (HTS) have fostered rapid developments in the field of microbiome research, and massive microbiome datasets are now being generated. However, the diversity of software tools and the complexity of analysis pipelines make it difficult to access this field. Here, we systematically summarize the advantages and limitations of microbiome methods. Then, we recommend specific pipelines for amplicon and metagenomic analyses, and describe commonly-used software and databases, to help researchers select the appropriate tools. Furthermore, we introduce statistical and visualization methods suitable for microbiome analysis, including alpha- and beta-diversity, taxonomic composition, difference comparisons, correlation, networks, machine learning, evolution, source tracing, and common visualization styles to help researchers make informed choices. Finally, a step-by-step reproducible analysis guide is introduced. We hope this review will allow researchers to carry out data analysis more effectively and to quickly select the appropriate tools in order to efficiently mine the biological significance behind the data.
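Two of the measures named in this abstract can be shown in a few lines: the Shannon index as an example of alpha diversity (diversity within one sample) and Bray-Curtis dissimilarity as an example of beta diversity (difference between two samples). These are standard formulas, but the review covers many other metrics, and the function names and count vectors below are illustrative assumptions.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with p_i > 0."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no counts")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / (sum a_i + sum b_i)."""
    if len(sample_a) != len(sample_b):
        raise ValueError("samples must cover the same taxa")
    denominator = sum(sample_a) + sum(sample_b)
    if denominator == 0:
        raise ValueError("both samples are empty")
    return sum(abs(a - b) for a, b in zip(sample_a, sample_b)) / denominator


if __name__ == "__main__":
    # Hypothetical taxon counts for two samples over the same three taxa.
    gut_a = [10, 10, 0]
    gut_b = [0, 10, 10]
    print(shannon_index(gut_a), bray_curtis(gut_a, gut_b))
```

Bray-Curtis ranges from 0 (identical composition) to 1 (no shared taxa), while the Shannon index grows with both the number of taxa and the evenness of their abundances.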

19.
Nervous systems extract and process information from the environment to alter animal behavior and physiology. Despite progress in understanding how different stimuli are represented by changes in neuronal activity, less is known about how they affect broader neural network properties. We developed a framework for using graph-theoretic features of neural network activity to predict ecologically relevant stimulus properties, in particular stimulus identity. We used the transparent nematode, Caenorhabditis elegans, with its small nervous system to define neural network features associated with various chemosensory stimuli. We first immobilized animals using a microfluidic device and exposed their noses to chemical stimuli while monitoring changes in neural activity of more than 50 neurons in the head region. We found that graph-theoretic features, which capture patterns of interactions between neurons, are modulated by stimulus identity. Further, we show that a simple machine learning classifier trained using graph-theoretic features alone, or in combination with neural activity features, can accurately predict salt stimulus. Moreover, by focusing on putative causal interactions between neurons, the graph-theoretic features were almost twice as predictive as the neural activity features. These results reveal that stimulus identity modulates the broad, network-level organization of the nervous system, and that graph theory can be used to characterize these changes.
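To give a sense of what graph-theoretic features look like in practice, the sketch below computes two standard ones, node degree and local clustering coefficient, from an adjacency matrix. The abstract does not list the features the authors used, and it emphasises putative causal (directed) interactions; this example uses an undirected graph for simplicity, and the function name and toy network are assumptions.

```python
def graph_features(adjacency):
    """Degree and local clustering coefficient for each node.

    adjacency: symmetric 0/1 matrix (list of lists) of an undirected
    graph; self-loops are ignored.
    """
    n = len(adjacency)
    features = []
    for i in range(n):
        neighbours = [j for j in range(n) if j != i and adjacency[i][j]]
        k = len(neighbours)
        # Count edges that connect pairs of this node's neighbours.
        links = sum(
            1
            for a in range(k)
            for b in range(a + 1, k)
            if adjacency[neighbours[a]][neighbours[b]]
        )
        clustering = 2 * links / (k * (k - 1)) if k > 1 else 0.0
        features.append({"degree": k, "clustering": clustering})
    return features


if __name__ == "__main__":
    # Hypothetical network: a triangle (0, 1, 2) plus node 3 attached to 2.
    adjacency = [
        [0, 1, 1, 0],
        [1, 0, 1, 0],
        [1, 1, 0, 1],
        [0, 0, 1, 0],
    ]
    for node, feats in enumerate(graph_features(adjacency)):
        print(node, feats)
```

Per-node values like these can be concatenated into a feature vector and passed to a classifier, which is the general pattern the abstract describes for predicting stimulus identity.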

20.
Post-translational modifications (PTMs) play an essential role in most biological processes. PTMs on human proteins have been extensively studied. Studies on bacterial PTMs are emerging, which demonstrate that bacterial PTMs are different from human PTMs in their types, mechanisms and functions. Few PTM studies have been done on the microbiome. Here, we reviewed several studied PTMs in bacteria including phosphorylation, acetylation, succinylation, glycosylation, and proteases. We discussed the enzymes responsible for each PTM and their functions. We also summarized the current methods used to study microbiome PTMs and the observations demonstrating the roles of PTM in the microbe-microbe interactions within the microbiome and their interactions with the environment or host. Although new methods and tools for PTM studies are still needed, the existing technologies have made great progress enabling a deeper understanding of the functional regulation of the microbiome. Large-scale application of these microbiome-wide PTM studies will provide a better understanding of the microbiome and its roles in the development of human diseases.
