Similar literature
20 similar articles found.
1.
Research on the microbiome has been actively conducted worldwide, and the results show that the human gut bacterial environment significantly affects the immune system, psychological conditions, cancers, obesity, and metabolic diseases. Thanks to advances in sequencing technology, microbiome studies with large numbers of samples are now feasible at an acceptable cost. Large samples allow more sophisticated modeling with machine learning approaches to study relationships between the microbiome and various traits. This article provides an overview of machine learning methods for non-data scientists interested in the association analysis of microbiomes and host phenotypes. Once the genomic features of the microbiome are determined, various analysis methods can be used to explore the relationship between the microbiome and host phenotypes, including penalized regression, support vector machine (SVM), random forest, and artificial neural network (ANN). Deep neural network methods are also touched upon. The analysis procedure, from environment setup to extraction of the results, is presented in the Python programming language.

2.
The search for predictive biomarkers of disease from high-throughput mass spectrometry (MS) data requires a complex analysis path. Preprocessing and machine-learning modules are pipelined, starting from raw spectra, to set up a predictive classifier based on a shortlist of candidate features. As a machine-learning problem, proteomic profiling on MS data requires the same caution as the microarray case. The risk of overfitting and of selection bias effects is pervasive: not only do potential features easily outnumber samples by a factor of 10^3, but it is also easy to neglect information-leakage effects during preprocessing from spectra to peaks. The aim of this review is to explain how to build a general-purpose design analysis protocol (DAP) for predictive proteomic profiling: we show how to limit leakage due to parameter tuning and how to organize classification and ranking on large numbers of replicate versions of the original data to avoid selection bias. The DAP can be used with alternative components, i.e. with different preprocessing methods (peak clustering or wavelet based), classifiers (e.g. Support Vector Machine, SVM) or feature ranking methods (recursive feature elimination or I-Relief). A procedure for assessing the stability and predictive value of the resulting biomarker list is also provided. The approach is exemplified with experiments on synthetic datasets (from the Cromwell MS simulator) and with publicly available datasets from cancer studies.
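The selection-bias problem described in this abstract can be illustrated with a minimal sketch. It is not the DAP itself: the ranking rule (difference of class means), the nearest-centroid classifier, and all function names below are assumptions chosen only to keep the example dependency-free. The point it shows is that ranking features on the full data set before cross-validation lets test labels leak into the model, whereas ranking inside each training fold does not.

```python
import random

def rank_features(X, y, n_keep):
    """Indices of the n_keep features with the largest absolute
    difference between the two class means (illustrative ranking rule)."""
    n_feat = len(X[0])
    scores = []
    for j in range(n_feat):
        a = [row[j] for row, lab in zip(X, y) if lab == 0]
        b = [row[j] for row, lab in zip(X, y) if lab == 1]
        scores.append(abs(sum(a) / len(a) - sum(b) / len(b)))
    return sorted(range(n_feat), key=lambda j: -scores[j])[:n_keep]

def nearest_centroid_predict(X_train, y_train, x, feats):
    """Predict the label whose class centroid (on feats) is closest to x."""
    best_label, best_dist = None, float("inf")
    for lab in (0, 1):
        rows = [r for r, l in zip(X_train, y_train) if l == lab]
        centroid = [sum(r[j] for r in rows) / len(rows) for j in feats]
        dist = sum((x[j] - c) ** 2 for j, c in zip(feats, centroid))
        if dist < best_dist:
            best_label, best_dist = lab, dist
    return best_label

def cv_accuracy(X, y, k, n_keep, select_inside_fold=True):
    """k-fold cross-validated accuracy (sample i goes to fold i % k).
    Assumes every training fold contains both classes."""
    n = len(X)
    # Ranking on all samples uses the test labels too: this is the leak.
    global_feats = rank_features(X, y, n_keep)
    correct = 0
    for fold in range(k):
        train_idx = [i for i in range(n) if i % k != fold]
        test_idx = [i for i in range(n) if i % k == fold]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        if select_inside_fold:
            feats = rank_features(X_tr, y_tr, n_keep)  # training data only
        else:
            feats = global_feats
        for i in test_idx:
            pred = nearest_centroid_predict(X_tr, y_tr, X[i], feats)
            correct += int(pred == y[i])
    return correct / n

# Pure-noise data: 20 samples, 500 features, labels unrelated to features.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(500)] for _ in range(20)]
y = [i % 2 for i in range(20)]
leaky = cv_accuracy(X, y, k=5, n_keep=10, select_inside_fold=False)
honest = cv_accuracy(X, y, k=5, n_keep=10, select_inside_fold=True)
print("selection outside folds:", leaky)
print("selection inside folds: ", honest)
```

Because the labels carry no signal here, any accuracy clearly above 0.5 in the first estimate is an artifact of selection; the second estimate is the one that should typically stay near chance.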

3.
4.

Background  

A key challenge in systems biology is the reconstruction of an organism's metabolic network from its genome sequence. One strategy for addressing this problem is to predict which metabolic pathways, from a reference database of known pathways, are present in the organism, based on the annotated genome of the organism.

5.
A complete prototype for the automatic detection of normal examinations on a teleophthalmology network for diabetic retinopathy screening is presented. The system combines pathological pattern mining methods with specific lesion detection methods to extract information from the images. This information, plus patient and other contextual data, is used by a classifier to compute an abnormality risk. Such a system should reduce the burden on readers in teleophthalmology networks.

6.
Increasingly, biologists and ecologists are becoming aware of the vital importance of soil to processes observed aboveground and are incorporating soil analyses into their research. Because of the dynamic and heterogeneous nature of soil, proper incorporation of soil analysis into ecological studies requires knowledge and planning. Unfortunately, many ecologists may not be current (or trained at all) in soil science. We provide this review, based on our cumulative >60 years of work in soil science, to help familiarize researchers with the essential information needed to appropriately incorporate soil analyses into ecological studies. Specifically, we provide a brief introduction to soils and then discuss issues related to soil sterilization, choosing a soil for a greenhouse project, sampling soils, and soil analyses.

7.
8.

Background  

Current efforts in metabolomics, such as the Human Metabolome Project, collect structures of biological metabolites as well as data for their characterisation, such as spectra for the identification of substances and measurements of their concentration. Still, only a fraction of existing metabolites and their spectral fingerprints are known. Computer-Assisted Structure Elucidation (CASE) of biological metabolites will be an important tool to address this lack of knowledge. Indispensable for CASE are modules to predict spectra for hypothetical structures. This paper evaluates different statistical and machine learning methods for predicting proton NMR spectra based on data from our open database NMRShiftDB.

9.
Tropical biodiversity continues to erode unabated, which calls for ecologists to address the problem directly, placing less reliance on indirect interventions, such as community-based development schemes. Ecologists must become more assertive in providing scientifically formulated and adaptively managed interventions, involving biodiversity payments, to serve local, regional and global interests in tropical nature. Priorities for tropical ecologists thus include the identification of key thresholds to ecological resilience, and the formulation of clear monitoring protocols and management strategies for implementation by local resource managers. A particular challenge is to demonstrate how nature reserves contribute to the adaptive capacity of regional land-use matrices and, hence, to the provision of sustainable benefits at multiple spatial and temporal scales.

10.
Bioinformatics is the field where computational methods from various domains have come together for the analysis of biological data. Each domain has introduced its own specific jargon. However, in closely related domains, e.g. machine learning and statistics, both concordant and discordant terminology occurs, and the latter can lead to confusion. This article aims to help resolve the confusion of tongues arising from these two closely related domains, which are frequently used in bioinformatics. We provide a short summary of the most commonly applied machine learning and statistical approaches to data analysis in bioinformatics, i.e. classification and statistical hypothesis testing. We explain differences and similarities in common terminology used in various domains, such as precision, recall, sensitivity and true positive rate. This primer can serve as a guide to the terminology used in these fields.
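As a small illustration of the terminology overlap mentioned in this abstract, the sketch below computes precision and recall from binary labels and makes explicit that recall, sensitivity and true positive rate are three names for the same quantity, TP / (TP + FN). The function names and the toy labels are illustrative assumptions, not taken from the article.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels, 1 = positive, 0 = negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def precision(tp, fp):
    """Fraction of predicted positives that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else float("nan")

def recall(tp, fn):
    """Fraction of true positives that were found."""
    return tp / (tp + fn) if (tp + fn) else float("nan")

# Same quantity under the names used in statistics and in ROC analysis.
sensitivity = recall
true_positive_rate = recall

# Toy example: 4 true positives in the data, 5 positive predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print("precision:", precision(tp, fp))
print("recall / sensitivity / TPR:", recall(tp, fn))
```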

11.
Studies of song learning in birds have revealed a puzzling property of the acquisition system: stimulus memorization becomes effective after remarkably few exposures, but nevertheless shows a relationship to the frequency of exposure to learning stimuli. This raises questions about the amount of learning that occurs during a given exposure to song. To examine this issue, we tutored hand-raised fledgling nightingales (Luscinia megarhynchos) with song strings in which the serial succession of species-typical master songs was altered upon subsequent exposures. The sequencing of imitations obtained from the birds' adult singing revealed the following results: (1) A single exposure was sufficient for subjects to acquire serial information on song-type sequencing. (2) The first exposure to a master string played a key role in this accomplishment. (3) Nevertheless, the acquisition of serial information improved with increasing exposure frequency of master strings. (4) The acquisition of song patterns was not impaired by a non-regular presentation of master song-types. Given the particular salience of the first exposure for sequence memorization, we termed the phenomenon the primer effect. The findings suggest that stimulus acquisition during perceptual song learning is mediated by a discontinuous process. Once acquired, information is then consolidated gradually, i.e. through an incremental process.

12.
Reactive oxygen species (ROS)-induced damage to host cells and molecules has been considered the most likely proximal mechanism responsible for the age-related decline in organismal performance. Organisms have two possible ways to reduce the negative effect of ROS: deploying effective antioxidant defenses and minimizing ROS production. The imbalance between the amount of ROS produced and the availability of antioxidant defenses determines the intensity of so-called oxidative stress. Interestingly, most studies that deal with the effect of oxidative stress on organismal performance have focused on the antioxidant defense compartment and, surprisingly, have neglected the mechanisms that control ROS production within mitochondria. Uncoupling proteins (UCPs), mitochondrial transporters of the inner membrane, are involved in the control of the redox state of cells and in the production of mitochondrial ROS. Given their function, UCPs might therefore represent a major mechanistic link between metabolic activity and fitness. We suggest that by exploring the role of the expression and function of UCPs in both experimental and comparative studies, evolutionary biologists may gain better insight into this link.

13.
14.
15.
Joshua Ladau, Sadie J. Ryan. Oikos, 2010, 119(7): 1064-1069
Null model tests of presence–absence data (‘NMTPAs’) provide important tools for inferring effects of competition, facilitation, habitat filtering, and other ecological processes from observational data. Many NMTPAs have been developed, but they often yield conflicting conclusions when applied to the same data. Type I and II error rates, size, power, robustness and bias provide important criteria for assessing which tests are valid, but these criteria need to be evaluated contingent on the sample size, null hypothesis of interest, and assumptions that are appropriate for the data set that is being analyzed. In this paper, we confirm that this is the case using the software MPower, evaluating the validity of NMTPAs contingent on the null hypothesis being tested, assumptions that can be made, and sample size. Evaluating the validity of NMTPAs contingent on these factors is important for ensuring that reliable inferences are drawn from observational data about the processes controlling community assembly.
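For readers unfamiliar with the general shape of such a test, here is a minimal sketch of one possible NMTPA. It is an assumption-laden illustration and has nothing to do with MPower: the test statistic is the checkerboard score (C-score) of Stone and Roberts, and the null model shuffles each species' occurrences across sites independently, which keeps species totals fixed and treats sites as equally likely. Other statistics and other null models are equally legitimate choices, which is precisely why the conclusions can differ.

```python
import random
from itertools import combinations

def c_score(matrix):
    """Mean number of checkerboard units over all species pairs.
    matrix: list of rows (species), each a list of 0/1 per site."""
    pairs = list(combinations(matrix, 2))
    total = 0
    for a, b in pairs:
        shared = sum(x * y for x, y in zip(a, b))  # sites with both species
        total += (sum(a) - shared) * (sum(b) - shared)
    return total / len(pairs)

def null_model_p_value(matrix, n_iter=999, seed=0):
    """One-sided Monte Carlo p-value for 'C-score is larger than expected'.
    Null model (an assumption): permute each row independently."""
    rng = random.Random(seed)
    observed = c_score(matrix)
    at_least = 0
    for _ in range(n_iter):
        shuffled = [rng.sample(row, len(row)) for row in matrix]
        if c_score(shuffled) >= observed:
            at_least += 1
    # Count the observed matrix itself among the null draws.
    return observed, (at_least + 1) / (n_iter + 1)

# Hypothetical presence-absence matrix: 4 species (rows) x 6 sites (columns).
community = [
    [1, 0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0, 1],
    [1, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 1],
]
obs, p = null_model_p_value(community)
print("observed C-score:", obs, "p-value:", p)
```

A small p-value under this particular null model would suggest more segregation between species than expected by chance; a different null model applied to the same matrix may well give a different answer.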

16.
“Intellectual property” (IP) is a generic legal term for patents, copyrights, and trademarks, which provide legal rights to protect ideas, the expression of ideas, and the inventors and creators of such ideas. A patent provides legal protection for a new invention, an application of a new idea, discovery, or concept that is useful. Copyright provides legal protection from copying for any creative work, as well as business and scientific publications, computer software, and compilations of information. A trademark provides rights to use symbols, particular words, logos, or other markings that indicate the source of a product or service. A further method of benefiting from an invention is simply to keep it secret rather than disclose it, as a “trade secret.” IP impinges on almost everything scientists do. As scientists are paid to come up with ideas and aspire to patent and/or publish their work, the protection of ideas, and of written works especially, should be of interest and concern to all.

17.
Many ecological studies use the analysis of count data to arrive at biologically meaningful inferences. Here, we introduce a hierarchical Bayesian approach to count data. This approach has the advantage over traditional approaches that it directly estimates the parameters of interest at both the individual level and the population level, appropriately models uncertainty, and allows for comparisons among models, including those that exceed the complexity of many traditional approaches, such as ANOVA or non-parametric analogs. As an example, we apply this method to oviposition preference data for butterflies in the genus Lycaeides. Using this method, we estimate the parameters that describe preference for each population, compare the preference hierarchies among populations, and explore various models that group populations that share the same preference hierarchy.

18.
19.
MOTIVATION: We describe APDB, a novel measure for evaluating the quality of a protein sequence alignment, given two or more PDB structures. This evaluation does not require a reference alignment or a structure superposition. APDB is designed to efficiently and objectively benchmark multiple sequence alignment methods. RESULTS: Using existing collections of reference multiple sequence alignments and existing alignment methods, we show that APDB gives results that are consistent with those obtained using conventional evaluations. We also show that APDB is suitable for evaluating sequence alignments that are structurally equivalent. We conclude that APDB provides an alternative to more conventional methods used for benchmarking sequence alignment packages.

20.