Found 20 similar articles (search time: 31 ms)
1.
Stéphanie Muller, Pascal Lesage, Réjean Samson 《The International Journal of Life Cycle Assessment》2016,21(8):1185-1196
Purpose
Life cycle inventory (LCI) databases provide generic data on exchange values associated with unit processes. The “ecoinvent” LCI database estimates the uncertainty of all exchange values through the application of the so-called pedigree approach. In the first release of the database, the uncertainty factors used were based on experts’ judgments. In 2013, Ciroth et al. derived empirically based factors. These, however, assumed that the same uncertainty factors could be used for all industrial sectors and fell short of providing basic uncertainty factors. The work presented here aims to overcome these limitations.
Methods
The proposed methodological framework is based on the assessment of more than 60 data sources (23,200 data points) and the use of Bayesian inference. Bayesian inference allows uncertainty factors to be updated by systematically combining experts’ judgments and other prior information about the uncertainty factors with new data.
Results and discussion
Applying the methodology to the data sources yields new uncertainty factors for all additional uncertainty indicators and for some specific industrial sectors. It also yields some basic uncertainty factors. In general, the factors obtained are higher than those obtained in previous work, which suggests that the experts had initially underestimated uncertainty. Furthermore, the presented methodology can be applied to update uncertainty factors as new data become available.
Conclusions
In practice, these uncertainty factors can systematically be incorporated in LCI databases as estimates of exchange value uncertainty where more formal uncertainty information is not available. Bayesian inference is applied here to update uncertainty factors but can also be used in other life cycle assessment developments to improve experts’ judgments or to update parameter values when new data can be accessed.
2.
Background
Inference of gene networks from expression data is an important problem in computational biology. Many algorithms have been proposed for solving the problem efficiently. However, many of the available implementations are programming libraries that require users to write code, which limits their accessibility.
Results
We have developed a tool called CyNetworkBMA for inferring gene networks from expression data that integrates with Cytoscape. Our application offers a graphical user interface for networkBMA, an efficient implementation of Bayesian Model Averaging methods for network construction. The client-server architecture of CyNetworkBMA makes it possible to distribute or centralize computation depending on user needs.
Conclusions
CyNetworkBMA is an easy-to-use tool that makes network inference accessible to non-programmers through seamless integration with Cytoscape. CyNetworkBMA is available on the Cytoscape App Store at http://apps.cytoscape.org/apps/cynetworkbma.
3.
Background
One of the recent challenges of computational biology is the development of new algorithms, tools and software to facilitate predictive modeling of big data generated by high-throughput technologies in biomedical research.
Results
To meet these demands we developed PROPER, a package for visual evaluation of ranking classifiers for biological big data mining studies in the MATLAB environment.
Conclusion
PROPER is an efficient tool for optimization and comparison of ranking classifiers, providing over 20 different two- and three-dimensional performance curves.
4.
Background
DNase I hypersensitive sites (DHSs) are associated with cis-regulatory DNA elements. An efficient method of identifying DHSs can enhance the understanding of chromatin accessibility. Despite a multitude of resources available online, including experimental datasets and computational tools, the complex language of DHSs remains incompletely understood.
Methods
Here, we address this challenge using an approach based on a state-of-the-art machine learning method. We present a novel convolutional neural network (CNN) that combines Inception-like networks with a gating mechanism, capturing the response of multiple patterns and long-term associations in DNA sequences, to predict multi-scale DHSs in Arabidopsis, rice and Homo sapiens.
Results
Our method obtains an area under the curve (AUC) of 0.961 on Arabidopsis, 0.969 on rice and 0.918 on Homo sapiens.
Conclusions
Our method provides an efficient and accurate way to identify multi-scale DHS sequences by deep learning.
5.
Introduction
Untargeted metabolomics is a powerful tool for biological discoveries. To analyze the complex raw data, significant advances in computational approaches have been made, yet it is not clear how exhaustive and reliable the data analysis results are.
Objectives
Assessment of the quality of raw data processing in untargeted metabolomics.
Methods
Five published untargeted metabolomics studies were reanalyzed.
Results
Omissions of at least 50 relevant compounds from the original results as well as examples of representative mistakes were reported for each study.
Conclusion
Incomplete raw data processing shows unexplored potential of current and legacy data.
6.
Edward Meeds, Michael Chiang, Mary Lee, Olivier Cinquin, John Lowengrub, Max Welling 《BMC bioinformatics》2015,16(1):264
Background
In many domains, scientists build complex simulators of natural phenomena that encode their hypotheses about the underlying processes. These simulators can be deterministic or stochastic, fast or slow, constrained or unconstrained, and so on. Optimizing the simulators with respect to a set of parameter values is common practice, resulting in a single parameter setting that minimizes an objective subject to constraints.
Results
We propose algorithms for post optimization posterior evaluation (POPE) of simulators. The algorithms compute and visualize all simulations that can generate results of the same or better quality than the optimum, subject to constraints. These optimization posteriors are desirable for a number of reasons, among which are easy interpretability, automatic parameter sensitivity and correlation analysis, and posterior predictive analysis. Our algorithms are simple extensions to an existing simulation-based inference framework called approximate Bayesian computation. POPE is applied to two biological simulators: a fast and stochastic simulator of stem-cell cycling and a slow and deterministic simulator of tumor growth patterns.
Conclusions
POPE allows the scientist to explore and understand the role that constraints, both on the input and the output, have on the optimization posterior. As a Bayesian inference procedure, POPE provides a rigorous framework for the analysis of the uncertainty of an optimal simulation parameter setting.
7.
Background
Identification of phosphorylation sites by computational methods is becoming increasingly important because it reduces labor-intensive and costly experiments and can improve our understanding of the common properties and underlying mechanisms of protein phosphorylation.
Methods
This study presents a multitask learning framework that learns four kinase families simultaneously, instead of studying the phosphorylation sites of each kinase family separately. The framework includes two multitask classification methods: the Multi-Task Least Squares Support Vector Machines (MTLS-SVMs) and the Multi-Task Feature Selection (MT-Feat3).
Results
Using the multitask learning framework, we successfully identify 18 common features shared by four kinase families of phosphorylation sites. The reliability of the selected features is demonstrated by their consistent performance in the two multitask learning methods.
Conclusions
The selected features can be used to build efficient multitask classifiers with good performance, suggesting they are important to protein phosphorylation across the four kinase families.
8.
Emilia Daghir-Wojtkowiak, Paweł Wiczling, Małgorzata Waszczuk-Jankowska, Roman Kaliszan, Michał Jan Markuszewski 《Metabolomics : Official journal of the Metabolomic Society》2017,13(3):31
Introduction
Multilevel modeling is a quantitative statistical method to investigate variability and relationships between variables of interest, taking into account population structure and dependencies. It can be used for prediction, data reduction and causal inference from experiments and observational studies, allowing for more efficient elucidation of knowledge.
Objectives
In this study we introduced the concept of multilevel pharmacokinetics (PK)-driven modelling for large-sample, unbalanced and unadjusted metabolomics data comprising nucleoside and creatinine concentration measurements in the urine of healthy subjects and cancer patients.
Methods
A Bayesian multilevel model was proposed to describe the nucleoside and creatinine concentration ratio considering age, sex and health status as covariates. The predictive performance of the proposed model was summarized via area under the ROC curve, sensitivity and specificity using external validation.
Results
Cancer was associated with an increase in methylthioadenosine/creatinine excretion rate by a factor of 1.42 (1.09–2.03), which constituted the highest increase among all nucleosides. Age influenced nucleoside/creatinine excretion rates for all nucleosides in the same direction, which was likely caused by a decrease in creatinine clearance with age. There was weak evidence of sex-related differences for methylthioadenosine. The individual a posteriori prediction of patient classification, expressed as area under the ROC curve with 5th and 95th percentiles, was 0.57 (0.5–0.67), with sensitivity and specificity of 0.59 (0.42–0.76) and 0.57 (0.45–0.7), respectively, suggesting limited usefulness of the 13 nucleoside/creatinine urine concentration measurements in predicting disease in this population.
Conclusion
Bayesian multilevel pharmacokinetics-driven modeling in metabolomics may be useful in understanding the data and may constitute a new tool in the search for potential candidate disease indicators.
9.
10.
Yingfeng Wang, Xutao Wang, Xiaoqin Zeng 《Metabolomics : Official journal of the Metabolomic Society》2017,13(10):116
Introduction
Tandem mass spectrometry (MS/MS) has been widely used for identifying metabolites in many areas. However, computationally identifying metabolites from MS/MS data is challenging because the fragmentation rules, which determine the precedence of chemical bond dissociation, are unknown. Although this problem has been tackled in different ways, the lack of computational tools to flexibly represent adjacent structures of chemical bonds is still a long-term bottleneck for studying fragmentation rules.
Objectives
This study aimed to develop computational methods for investigating fragmentation rules by analyzing annotated MS/MS data.
Methods
We implemented a computational platform, MIDAS-G, for investigating fragmentation rules. MIDAS-G processes a metabolite as a simple graph and uses graph grammars to recognize specific chemical bonds and their adjacent structures. We can apply MIDAS-G to investigate fragmentation rules by adjusting bond weights in the scoring model of the metabolite identification tool and comparing metabolite identification performances.
Results
We used MIDAS-G to investigate four bond types on real annotated MS/MS data in experiments. The experimental results matched data collected from wet labs and literature. The effectiveness of MIDAS-G was confirmed.
Conclusion
We developed a computational platform for investigating fragmentation rules of tandem mass spectrometry. This platform is freely available for download.
11.
Background
Inferring species trees from gene trees using coalescent-based summary methods has been the subject of much attention, yet new scalable and accurate methods are needed.
Results
We introduce DISTIQUE, a new statistically consistent summary method for inferring species trees from gene trees under the coalescent model. We generalize our results to arbitrary phylogenetic inference problems; we show that two arbitrarily chosen leaves, called anchors, can be used to estimate relative distances between all other pairs of leaves by inferring relevant quartet trees. This results in a family of distance-based tree inference methods, with running times ranging from quadratic to quartic in the number of leaves.
Conclusions
We show in simulated studies that DISTIQUE has comparable accuracy to leading coalescent-based summary methods and reduced running times.
12.
N. Cesbron, A.-L. Royer, Y. Guitton, A. Sydor, B. Le Bizec, G. Dervilly-Pinel 《Metabolomics : Official journal of the Metabolomic Society》2017,13(8):99
Introduction
Collecting feces is easy. It offers direct access to endogenous and microbial metabolites.
Objectives
Given the lack of consensus about fecal sample preparation, especially in animal species, we developed a robust protocol allowing untargeted LC-HRMS fingerprinting.
Methods
The conditions of extraction (quantity, preparation, solvents, dilutions) were investigated in bovine feces.
Results
A rapid and simple protocol involving feces extraction with methanol (1/3, M/V) followed by centrifugation and a filtration step (10 kDa) was developed.
Conclusion
The workflow generated repeatable and informative fingerprints for robust metabolome characterization.
13.
Background
Mixtures of beta distributions are a flexible tool for modeling data with values on the unit interval, such as methylation levels. However, maximum likelihood parameter estimation with beta distributions suffers from problems because of singularities in the log-likelihood function if some observations take the values 0 or 1.
Methods
While ad-hoc corrections have been proposed to mitigate this problem, we propose a different approach to parameter estimation for beta mixtures where such problems do not arise in the first place. Our algorithm combines latent variables with the method of moments instead of maximum likelihood, which has computational advantages over the popular EM algorithm.
Results
As an application, we demonstrate that methylation state classification is more accurate when using adaptive thresholds from beta mixtures than non-adaptive thresholds on observed methylation levels. We also demonstrate that we can accurately infer the number of mixture components.
Conclusions
The hybrid algorithm between likelihood-based component un-mixing and moment-based parameter estimation is a robust and efficient method for beta mixture estimation. We provide an implementation of the method (“betamix”) as open source software under the MIT license.
14.
Background
Reconstructing a gene regulatory network (GRN) from gene expression data can reveal regulatory relationships among genes and provide deep insights into the complicated regulatory mechanisms of life. However, it is still a great challenge in systems biology and bioinformatics. In recent years, numerous computational approaches have been developed for this goal, and Bayesian network (BN) methods have drawn the most attention among them because of their inherent probabilistic characteristics. However, Bayesian network methods are time consuming and cannot handle large-scale networks due to their high computational complexity, while mutual information-based methods are highly effective but directionless and have a high false-positive rate.
Results
To solve these problems, we propose a Candidate Auto Selection algorithm (CAS) based on mutual information and breakpoint detection to restrict the search space and thereby accelerate the learning process of the Bayesian network. First, the proposed CAS algorithm automatically selects the neighbor candidates of each node before searching for the best structure of the GRN. Then, based on the CAS algorithm, we propose a globally optimal greedy search method (CAS + G), which focuses on finding the highest rated network structure, and a local learning method (CAS + L), which focuses on learning the structure faster with little loss of quality.
Conclusion
Results show that the proposed CAS algorithm can effectively reduce the search space of Bayesian networks by identifying the neighbor candidates of each node. In our experiments, the CAS + G method outperforms the state-of-the-art method on simulation data for inferring GRNs, and the CAS + L method is significantly faster than the state-of-the-art method with little loss of accuracy. Hence, the CAS-based methods effectively decrease the computational complexity of Bayesian network learning and are more suitable for GRN inference.
15.
Background
Longitudinal measurement is commonly employed in health research and provides numerous benefits for understanding disease and trait progression over time. More broadly, it allows for proper treatment of correlated responses within clusters. We evaluated 3 methods for analyzing genome-by-epigenome interactions with longitudinal outcomes from family data.
Results
Linear mixed-effects models, generalized estimating equations, and quadratic inference functions were used to test a pharmacoepigenetic effect in 200 simulated posttreatment replicates. Adjustment for baseline outcome provided greater power and more accurate control of Type I error rates than computation of a pre-to-post change score.
Conclusions
Comparison of all modeling approaches indicated a need for bias correction in marginal models and similar power for each method, with quadratic inference functions providing a minor decrement in power compared to generalized estimating equations and linear mixed-effects models.
16.
Mateusz Kurcinski, Maciej Blaszczyk, Maciej Pawel Ciemny, Andrzej Kolinski, Sebastian Kmiecik 《Biomedical engineering online》2017,16(1):73
Background
The characterization of protein–peptide interactions is a challenge for computational molecular docking. Protein–peptide docking tools face at least two major difficulties: (1) efficient sampling of large-scale conformational changes induced by binding and (2) selection of the best models from a large set of predicted structures. In this paper, we merge an efficient sampling technique with external information about side-chain contacts to sample and select the best possible models.
Methods
In this paper we test a new protocol that uses information about side-chain contacts in CABS-dock protein–peptide docking. As shown in our recent studies, CABS-dock enables efficient modeling of large-scale conformational changes without knowledge about the binding site. However, the resulting set of binding sites and poses is in many cases highly diverse and difficult to score.
Results
As we demonstrate here, information about a single side-chain contact can significantly improve the prediction accuracy. Importantly, the imposed constraints for side-chain contacts are quite soft. Therefore, the developed protocol does not require precise contact information and ensures large-scale peptide flexibility in the broad contact area.
Conclusions
The demonstrated protocol provides an extension of the CABS-dock method that can be used in practice for the structure prediction of protein–peptide complexes guided by knowledge of the binding interface.
17.
Rachel A. Spicer, Christoph Steinbeck 《Metabolomics : Official journal of the Metabolomic Society》2018,14(1):16
Introduction
Data sharing is increasingly required by journals and has been heralded as a solution to the ‘replication crisis’.
Objectives
(i) Review data sharing policies of journals publishing the most metabolomics papers associated with open data and (ii) compare these journals’ policies to those that publish the most metabolomics papers.
Methods
A PubMed search was used to identify metabolomics papers. Metabolomics data repositories were manually searched for linked publications.
Results
Journals that support data sharing are not necessarily those with the most papers associated with open metabolomics data.
Conclusion
Further efforts are required to improve data sharing in metabolomics.
18.
Antonio Rosato, Leonardo Tenori, Marta Cascante, Pedro Ramon De Atauri Carulla, Vitor A. P. Martins dos Santos, Edoardo Saccenti 《Metabolomics : Official journal of the Metabolomic Society》2018,14(4):37
Introduction
Metabolomics is a well-established tool in systems biology, especially in the top–down approach. Metabolomics experiments often result in discovery studies that provide intriguing biological hypotheses but rarely offer a mechanistic explanation of such findings. In this light, the interpretation of metabolomics data can be boosted by deploying systems biology approaches.
Objectives
This review aims to provide an overview of systems biology approaches that are relevant to metabolomics and to discuss some successful applications of these methods.
Methods
We review the most recent applications of systems biology tools in the field of metabolomics, such as network inference and analysis, metabolic modelling and pathway analysis.
Results
We offer an ample overview of systems biology tools that can be applied to address metabolomics problems. The characteristics and application results of these tools are also discussed comparatively.
Conclusions
Systems biology-enhanced analysis of metabolomics data can provide insights into the molecular mechanisms that give rise to the observed metabolic profiles and enhance the scientific impact of metabolomics studies.
19.
Chia-Chi Wang, Yuan-Chung Lin, Syu-Ruei Jhang, Chun-Wei Tung 《Biomedical engineering online》2017,16(1):66
Background
The immunotoxicity of engine exhausts is of high concern to human health due to the increasing prevalence of immune-related diseases. However, the evaluation of immunotoxicity of engine exhausts is currently based on expensive and time-consuming experiments. It is desirable to develop efficient methods for immunotoxicity assessment.
Methods
To accelerate the development of safe alternative fuels, this study proposed a computational method for identifying informative features for predicting proinflammatory potentials of engine exhausts. A principal component regression (PCR) algorithm was applied to develop prediction models. The informative features were identified by a sequential backward feature elimination (SBFE) algorithm.
Results
A total of 19 informative chemical and biological features were successfully identified by the SBFE algorithm. The informative features were utilized to develop a computational method named FS-CBM for predicting proinflammatory potentials of engine exhausts. The FS-CBM model achieved high performance, with correlation coefficients of 0.997 and 0.943 on the training and independent test sets, respectively.
Conclusions
The FS-CBM model was developed for predicting proinflammatory potentials of engine exhausts, with a large improvement in prediction performance compared with our previous CBM model. The proposed method could be further applied to construct models for bioactivities of mixtures.
20.