The ensemble modeling (EM) approach has shown promise in capturing kinetic and regulatory effects in the modeling of metabolic networks. The efficacy of the EM procedure relies on identifying model parameterizations that adequately describe all observed metabolic phenotypes upon perturbation. In this study, we propose an optimization-based algorithm for the systematic identification of genetic/enzyme perturbations that maximally reduce the number of models retained in the ensemble after each round of model screening. The key premise is to design perturbations that maximally scatter the predicted steady-state fluxes over the ensemble parameterizations. We demonstrate the applicability of this procedure for an Escherichia coli model of central metabolism by successively identifying single, double, and triple enzyme perturbations that cause the maximum degree of flux separation between models in the ensemble. The results reveal that optimal perturbations are not always located close to the reaction(s) whose fluxes are measured, especially when multiple perturbations are considered. In addition, there appears to be a maximum number of simultaneous perturbations beyond which no appreciable increase in the divergence of flux predictions is achieved. Overall, this study provides a systematic way of optimally designing genetic perturbations for populating the ensemble of models with relevant model parameterizations.
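The premise of scattering predicted fluxes across the ensemble can be illustrated with a minimal sketch. It is not the optimization formulation of the study: it simply enumerates every k-enzyme combination and scores it by the variance of the predicted fluxes across models. The function names, the `simulate` callable, the enzyme labels `E1` to `E3`, and the toy numbers are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import pvariance


def flux_scatter(predicted_fluxes):
    """Sum, over measured fluxes, of the variance across ensemble models.

    predicted_fluxes: one list of measured fluxes per model in the ensemble.
    """
    n_fluxes = len(predicted_fluxes[0])
    return sum(
        pvariance([model[j] for model in predicted_fluxes])
        for j in range(n_fluxes)
    )


def best_perturbation(enzymes, k, simulate):
    """Score every k-enzyme perturbation exhaustively; return the most scattering one.

    simulate(combo) is a hypothetical callable returning the predicted
    steady-state fluxes of each model under the perturbation `combo`.
    """
    scored = (
        (flux_scatter(simulate(combo)), combo)
        for combo in combinations(enzymes, k)
    )
    score, combo = max(scored)
    return combo, score


# Toy ensemble (made-up numbers): each model shifts one measured flux
# by a model-specific amount when an enzyme is perturbed.
toy_models = [
    {"E1": 0.0, "E2": 1.0, "E3": 0.1},
    {"E1": 0.1, "E2": -1.0, "E3": 0.1},
    {"E1": 0.0, "E2": 0.0, "E3": 0.2},
]


def toy_simulate(combo):
    return [[1.0 + sum(m[e] for e in combo)] for m in toy_models]


combo, score = best_perturbation(["E1", "E2", "E3"], 1, toy_simulate)
```

In this toy case the models disagree most about `E2`, so perturbing it separates their predictions the most and a single flux measurement would discard the most models. Exhaustive enumeration grows combinatorially with k, which is one reason an optimization-based search is attractive for larger networks.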
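This study investigates the risk features and classification prediction models for hepatitis B virus (HBV) reactivation in patients with primary liver cancer after precise radiotherapy. A feature selection method based on a genetic algorithm is proposed to select the optimal feature subset for HBV reactivation from the initial feature set of the primary liver cancer data. Bayesian and support vector machine classification models for HBV reactivation are built, and the classification performance of the optimal feature subset is compared with that of the initial feature set. The experimental results show that genetic-algorithm-based feature selection improves HBV reactivation classification, with the optimal feature subset clearly outperforming the initial feature set. The optimal feature subset affecting HBV reactivation comprises HBV DNA level, TNM tumor stage, Child-Pugh classification, outer margin, and maximum whole-liver dose. Classification accuracy reaches up to 82.89% for the Bayesian classifier and 83.34% for the support vector machine.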
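Genetic-algorithm feature selection of the kind described above can be sketched as follows. This is a generic illustration, not the authors' implementation: the operators (truncation selection, one-point crossover, bit-flip mutation, elitism), the parameter defaults, and the toy fitness function are assumptions. In practice the fitness would be something like the cross-validated accuracy of a Bayesian or support vector machine classifier trained on the selected features.

```python
import random


def ga_select(n_features, fitness, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Search for a 0/1 feature mask that maximizes `fitness` (hypothetical helper)."""
    rng = random.Random(seed)
    # Seed the population with the full feature set, so the final answer
    # is never worse than using every initial feature.
    pop = [tuple([1] * n_features)]
    pop += [
        tuple(rng.randint(0, 1) for _ in range(n_features))
        for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]           # truncation selection
        children = [pop[0]]                      # elitism: keep the best mask as-is
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)   # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation: each bit flips with probability p_mut.
            child = tuple(bit ^ (rng.random() < p_mut) for bit in child)
            children.append(child)
        pop = children
    return max(pop, key=fitness)


# Toy fitness (assumption): features 0 and 2 are informative,
# and every selected feature carries a small cost.
INFORMATIVE = (0, 2)


def toy_fitness(mask):
    return sum(mask[i] for i in INFORMATIVE) - 0.1 * sum(mask)


best = ga_select(5, toy_fitness)
```

Because the full feature set is in the starting population and the best mask is carried over each generation, the returned subset scores at least as well as the initial feature set under the chosen fitness.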
Introduction: Despite the unquestionable advantages of Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry Imaging (MALDI-MSI) in visualizing the spatial distribution and relative abundance of biomolecules directly on tissue, the resulting data are complex and high-dimensional. Analyzing and interpreting this large amount of information is therefore mathematically, statistically, and computationally challenging.
Areas covered: This article reviews some of the challenges in data processing, with particular emphasis on machine learning techniques employed in clinical applications, and can serve as an entry point for those who want to study the computational aspects. Several characteristics of data processing are described, highlighting their advantages and disadvantages. Different data processing approaches focused on clinical applications are also presented. A practical tutorial based on the Orange Canvas and Weka software is included to help readers become familiar with the data processing.
Expert commentary: Recently, MALDI-MSI has gained considerable attention and has been employed for research and diagnostic purposes, with successful results. Data dimensionality constitutes an important issue and statistical methods for information-preserving data reduction represent one of the most challenging aspects. The most common data reduction methods are characterized by collecting independent observations into a single table. However, the incorporation of relational information can improve the discriminatory capability of the data.