Similar literature
20 similar articles found
1.
Elucidating the structure and/or dynamics of gene regulatory networks from experimental data is a major goal of systems biology. Stochastic models have the potential to absorb noise, account for uncertainty, and help avoid data overfitting. Within the framework of probabilistic polynomial dynamical systems, we present an algorithm for the reverse engineering of any gene regulatory network as a discrete, probabilistic polynomial dynamical system. The resulting stochastic model is assembled from all minimal models in the model space, and the probability assignment is based on partitioning the model space according to the likelihood with which a minimal model explains the observed data. We used this method to identify stochastic models for two published synthetic network models. In both cases, the generated model retains the key features of the original model and compares favorably to the resulting models from other algorithms.

2.
Gene set analysis aims to identify predefined sets of functionally related genes that are differentially expressed between two conditions. Although gene set analysis has been very successful, by incorporating biological knowledge about the gene sets and enhancing statistical power over gene-by-gene analyses, it does not take into account the correlation (association) structure among the genes. In this work, we present CoGA (Co-expression Graph Analyzer), an R package for the identification of groups of differentially associated genes between two phenotypes. The analysis is based on concepts of Information Theory applied to the spectral distributions of the gene co-expression graphs, such as the spectral entropy to measure the randomness of a graph structure and the Jensen-Shannon divergence to discriminate classes of graphs. The package also includes common measures to compare gene co-expression networks in terms of their structural properties, such as centrality, degree distribution, shortest path length, and clustering coefficient. Besides the structural analyses, CoGA also includes graphical interfaces for visual inspection of the networks, ranking of genes according to their “importance” in the network, and the standard differential expression analysis. We show by both simulation experiments and analyses of real data that the statistical tests performed by CoGA indeed control the rate of false positives and are able to identify differentially co-expressed genes that other methods failed to detect.
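The two information-theoretic quantities named in this abstract can be illustrated with a minimal sketch. It assumes each graph's spectral distribution has already been discretised into a histogram over the same eigenvalue bins; the function names and the example histograms are hypothetical and are not taken from the CoGA package.

```python
import math

def shannon_entropy(p):
    """Shannon entropy (in bits) of a discrete distribution p."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence (in bits) between distributions p and q.

    Both inputs must be defined over the same bins and each must sum to 1.
    """
    # Mixture distribution: the bin-wise average of p and q.
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return shannon_entropy(m) - (shannon_entropy(p) + shannon_entropy(q)) / 2

# Hypothetical binned eigenvalue histograms for two co-expression graphs.
spec_a = [0.1, 0.4, 0.4, 0.1]
spec_b = [0.25, 0.25, 0.25, 0.25]

entropy_a = shannon_entropy(spec_a)   # lower: spectrum is more concentrated
entropy_b = shannon_entropy(spec_b)   # 2 bits: uniform over four bins
divergence = jensen_shannon_divergence(spec_a, spec_b)
```

With base-2 logarithms the divergence lies between 0 (identical histograms) and 1 (histograms with no overlapping bins), which makes it convenient as a bounded dissimilarity between two graphs.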

3.
Integrating physical knowledge and machine learning is a critical aspect of developing industrially focused digital twins for monitoring, optimisation, and design of microalgal and cyanobacterial photo-production processes. However, identifying the correct model structure to quantify the complex biological mechanism poses a severe challenge for the construction of kinetic models, while the lack of data due to the time-consuming experiments greatly impedes applications of most data-driven models. This study proposes the use of an innovative hybrid modelling approach that consists of a simple kinetic model to govern the overall process dynamic trajectory and a data-driven model to estimate mismatch between the kinetic equations and the real process. An advanced automatic model structure identification strategy is adopted to simultaneously identify the most physically probable kinetic model structure and minimum number of data-driven model parameters that can accurately represent multiple data sets over a broad spectrum of process operating conditions. Through this hybrid modelling and automatic structure identification framework, a highly accurate mathematical model was constructed to simulate and optimise an algal lutein production process. Performance of this hybrid model for long-term predictive modelling, optimisation, and online self-calibration is demonstrated and thoroughly discussed, indicating its significant potential for future industrial application.

4.
Dynamical modeling has proven useful for understanding how complex biological processes emerge from the many components and interactions composing genetic regulatory networks (GRNs). However, the development of models is hampered by large uncertainties in both the network structure and parameter values. To remedy this problem, the models are usually developed through an iterative process based on numerous simulations, confronting model predictions with experimental data and refining the model structure and/or parameter values to repair the inconsistencies. In this paper, we propose an alternative to this generate-and-test approach. We present a four-step method for the systematic construction and analysis of discrete models of GRNs by means of a declarative approach. Instead of instantiating the models as in classical modeling approaches, the biological knowledge of the network structure and its dynamics is formulated in the form of constraints. The compatibility of the network structure with the constraints is queried and, in case of inconsistencies, some constraints are relaxed. Common properties of the consistent models are then analyzed by means of dedicated languages. Two such languages are introduced in the paper. Removing questionable constraints or adding interesting ones allows further analysis of the models. This approach makes it possible to identify the best experiments to carry out in order to discriminate between sets of consistent models and refine our knowledge of how the system functions. We test the feasibility of our approach by applying it to the re-examination of a model describing the nutritional stress response in the bacterium Escherichia coli.

5.
In the serial gray box modeling strategy, generally available knowledge, represented in the macroscopic balance, is combined naturally with neural networks, which are powerful and convenient tools to model the inaccurately known terms in the macroscopic balance. This article shows, for a typical biochemical conversion, that in the serial gray box modeling strategy the identification data only have to cover the input-output space of the inaccurately known term in the macroscopic balances and that the accurately known terms can be used to achieve reliable extrapolation. The strategy is demonstrated successfully on the modeling of the enzymatic (repeated) batch conversion of penicillin G, for which real-time results are presented. Compared with a more data-driven black box strategy, the serial gray box strategy leads to models with reliable extrapolation properties, so that with the same number of identification experiments the model can be applied to a much wider range of different conditions. Compared to a more knowledge-driven white box strategy, the serial gray box model structure is only based on readily available or easily obtainable knowledge, so that the development time of serial gray box models still may be short in a situation where there is no detailed knowledge of the system available. © 1997 John Wiley & Sons, Inc. Biotechnol Bioeng 53: 549-566, 1997.

6.
The reverse engineering of metabolic networks from experimental data is traditionally a labor-intensive task requiring a priori systems knowledge. Using a proven model as a test system, we demonstrate an automated method to simplify this process by modifying an existing or related model (suggesting nonlinear terms and structural modifications) or even constructing a new model that agrees with the system's time series observations. In certain cases, this method can identify the full dynamical model from scratch without prior knowledge or structural assumptions. The algorithm selects between multiple candidate models by designing experiments to make their predictions disagree. We performed computational experiments to analyze a nonlinear seven-dimensional model of yeast glycolytic oscillations. This approach corrected mistakes reliably in both approximated and overspecified models. The method performed well up to high levels of noise for most states, could identify the correct model de novo, and made better predictions than ordinary parametric regression and neural network models. We identified an invariant quantity in the model, which accurately derived kinetics and the numerical sensitivity coefficients of the system. Finally, we compared the system to dynamic flux estimation and discussed the scaling and application of this methodology to automated experiment design and control in biological systems in real time.

7.
In areas such as drug development, clinical diagnosis and biotechnology research, acquiring details about the kinetic parameters of enzymes is crucial. The correct design of an experiment is critical to collecting data suitable for analysis, modelling and deriving the correct information. As classical design methods are not targeted to the more complex kinetics now frequently studied, attention is needed to estimate the parameters of such models with low variance. We demonstrate that a Bayesian approach (the use of prior knowledge) can produce major gains quantifiable in terms of information, productivity and accuracy of each experiment. Developing the use of Bayesian utility functions, we have used a systematic method to identify the optimum experimental designs for a number of kinetic model data sets. This has enabled the identification of trends between kinetic model types, sets of design rules and the key conclusion that such designs should be based on some prior knowledge of K_M and/or the kinetic model. We suggest an optimal and iterative method for selecting features of the design such as the substrate range, number of measurements and choice of intermediate points. The final design collects data suitable for accurate modelling and analysis and minimises the error in the parameters estimated.

8.
Large-scale microarray gene expression data provide the possibility of constructing genetic networks or biological pathways. Gaussian graphical models have been suggested to provide an effective method for constructing such genetic networks. However, most of the available methods for constructing Gaussian graphs do not account for the sparsity of the networks and are computationally more demanding or infeasible, especially in the settings of high dimension and low sample size. We introduce a threshold gradient descent (TGD) regularization procedure for estimating the sparse precision matrix in the setting of Gaussian graphical models and demonstrate its application to identifying genetic networks. Such a procedure is computationally feasible and can easily incorporate prior biological knowledge about the network structure. Simulation results indicate that the proposed method yields a better estimate of the precision matrix than the procedures that fail to account for the sparsity of the graphs. We also present the results on inference of a gene network for isoprenoid biosynthesis in Arabidopsis thaliana. These results demonstrate that the proposed procedure can indeed identify biologically meaningful genetic networks based on microarray gene expression data.
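As background for why a sparse precision matrix corresponds to a sparse network, the standard Gaussian graphical model relations are sketched below. The symbols (X, \Sigma, \Omega, \omega_{ij}, p) are assumed notation for illustration and do not come from the abstract; the TGD procedure itself is not reproduced here.

```latex
% Gene expression vector X = (X_1, ..., X_p) with covariance \Sigma
% and precision matrix \Omega = \Sigma^{-1} = (\omega_{ij}).
X \sim \mathcal{N}(\mu, \Sigma), \qquad \Omega = \Sigma^{-1}

% A zero entry of \Omega means conditional independence, i.e. no edge i -- j:
\omega_{ij} = 0 \iff X_i \perp X_j \mid X_{\{1,\dots,p\} \setminus \{i,j\}}

% Partial correlation between genes i and j given all other genes:
\rho_{ij \mid \text{rest}} = -\frac{\omega_{ij}}{\sqrt{\omega_{ii}\,\omega_{jj}}}
```

Estimating which off-diagonal entries of \Omega are nonzero is therefore equivalent to estimating the edge set of the genetic network.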

9.
Identifying the structure and dynamics of synaptic interactions between neurons is the first step to understanding neural network dynamics. The presence of synaptic connections is traditionally inferred through the use of targeted stimulation and paired recordings or by post-hoc histology. More recently, causal network inference algorithms have been proposed to deduce connectivity directly from electrophysiological signals, such as extracellularly recorded spiking activity. Usually, these algorithms have not been validated on a neurophysiological data set for which the actual circuitry is known. Recent work has shown that traditional network inference algorithms based on linear models typically fail to identify the correct coupling of a small central pattern generating circuit in the stomatogastric ganglion of the crab Cancer borealis. In this work, we show that point process models of observed spike trains can guide inference of relative connectivity estimates that match the known physiological connectivity of the central pattern generator up to a choice of threshold. We elucidate the necessary steps to derive faithful connectivity estimates from a model that incorporates the spike train nature of the data. We then apply the model to measure changes in the effective connectivity pattern in response to two pharmacological interventions, which affect both intrinsic neural dynamics and synaptic transmission. Our results provide the first successful application of a network inference algorithm to a circuit for which the actual physiological synapses between neurons are known. The point process methodology presented here generalizes well to larger networks and can describe the statistics of neural populations. In general, we show that advanced statistical models allow for the characterization of effective network structure, deciphering underlying network dynamics and estimating information-processing capabilities.

10.
Cho KH, Choo SM, Wellstead P, Wolkenhauer O. FEBS Letters, 2005, 579(20): 4520-4528
We propose a unified framework for the identification of functional interaction structures of biomolecular networks in a way that leads to a new experimental design procedure. In developing our approach, we have built upon previous work. Thus we begin by pointing out some of the restrictions associated with existing structure identification methods and how these restrictions may be eased. In particular, existing methods use specific forms of experimental algebraic equations with which to identify the functional interaction structure of a biomolecular network. In our work, we employ an extended form of these experimental algebraic equations which, while retaining their merits, also overcomes some of their disadvantages. Experimental data are required in order to estimate the coefficients of the experimental algebraic equation set associated with the structure identification task. However, experimentalists are rarely provided with guidance on which parameters to perturb and to what extent. When a model of network dynamics is required, there is also the vexed question of sample rate and sample time selection to be resolved. Supplying some answers to these questions is the main motivation of this paper. The approach is based on stationary and/or temporal data obtained from parameter perturbations, and unifies the previous approaches of Kholodenko et al. (PNAS 99 (2002) 12841-12846) and Sontag et al. (Bioinformatics 20 (2004) 1877-1886). By way of demonstration, we apply our unified approach to a network model which cannot be properly identified by existing methods. Finally, we propose an experiment design methodology, which is not limited by the number of parameter perturbations, and illustrate its use with an in numero example.

11.
Multi-protein machines are responsible for most cellular tasks, and many efforts have been invested in the systematic identification and characterization of thousands of these macromolecular assemblies. Unfortunately, the (quasi) atomic details necessary to understand their function are available only for a tiny fraction of the known complexes. The computational biology community is developing strategies to integrate structural data of different nature, from electron microscopy to X-ray crystallography, to model large molecular machines, as has been done for individual proteins and interactions with remarkable success. However, unlike for binary interactions, there is no reliable gold-standard set of three-dimensional (3D) complexes to benchmark the performance of these methodologies and detect their limitations. Here, we present a strategy to dynamically generate non-redundant sets of 3D heteromeric complexes with three or more components. By changing the values of sequence identity and component overlap between assemblies required to define complex redundancy, we can create sets of representative complexes with known 3D structure (i.e., target complexes). Using an identity threshold of 20% and imposing a fraction of component overlap of < 0.5, we identify 495 unique target complexes, which represent a real non-redundant set of heteromeric assemblies with known 3D structure. Moreover, for each target complex, we also identify a set of assemblies, of varying degrees of identity and component overlap, that can be readily used as input in a complex modeling exercise (i.e., template subcomplexes). We hope that resources like this will significantly help the development and progress assessment of novel methodologies, as docking benchmarks and blind prediction contests did. The interactive resource is accessible at https://DynBench3D.irbbarcelona.org.

12.
Gas chromatographic fatty acid methyl ester (FAME) analysis of bacteria is an easy, cheap, fast and automated identification tool routinely used in microbiological research. This paper reports on the application of artificial neural networks for genus-wide FAME-based identification of Bacillus species. Using 1,071 FAME profiles covering a genus-wide spectrum of 477 strains and 82 species, different balanced and imbalanced data sets have been created according to different validation methods and model parameters. Following training and validation, each classifier was evaluated on its ability to identify the profiles of a test set. Comparison of the classifiers showed a good identification rate favoring the imbalanced data sets. The presence of the Bacillus cereus and Bacillus subtilis groups made clear that it is of great importance to take into account the limitations of FAME analysis resolution for the construction of identification models. Indeed, as members of such a group cannot easily be distinguished from one another based upon FAME data alone, identification models built upon these data cannot be successful at keeping them apart either. Comparison of the different experimental setups ultimately led to a few general recommendations. With respect to the routinely used commercial Sherlock Microbial Identification System (MIS, Microbial ID, Inc. (MIDI), Newark, Delaware, USA), the artificial neural network test results showed a significant improvement in Bacillus species identification. These results indicate that machine learning techniques such as artificial neural networks are most promising tools for FAME-based classification and identification of bacterial species.

13.

Background

Network-based approaches for the analysis of large-scale genomics data have become well established. Biological networks provide a knowledge scaffold against which the patterns and dynamics of ‘omics’ data can be interpreted. The background information required for the construction of such networks is often dispersed across a multitude of knowledge bases in a variety of formats. The seamless integration of this information is one of the main challenges in bioinformatics. The Semantic Web offers powerful technologies for the assembly of integrated knowledge bases that are computationally comprehensible, thereby providing a potentially powerful resource for constructing biological networks and network-based analysis.

Results

We have developed the Gene eXpression Knowledge Base (GeXKB), a semantic web technology based resource that contains integrated knowledge about gene expression regulation. To affirm the utility of GeXKB we demonstrate how this resource can be exploited for the identification of candidate regulatory network proteins. We present four use cases that were designed from a biological perspective in order to find candidate members relevant for the gastrin hormone signaling network model. We show how a combination of specific query definitions and additional selection criteria derived from gene expression data and prior knowledge concerning candidate proteins can be used to retrieve a set of proteins that constitute valid candidates for regulatory network extensions.

Conclusions

Semantic web technologies provide the means for processing and integrating various heterogeneous information sources. The GeXKB offers biologists such an integrated knowledge resource, allowing them to address complex biological questions pertaining to gene expression. This work illustrates how GeXKB can be used in combination with gene expression results and literature information to identify new potential candidates that may be considered for extending a gene regulatory network.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0386-y) contains supplementary material, which is available to authorized users.

14.
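A review of research on ecological network analysis methods   Total citations: 13 (self-citations: 8, citations by others: 5)
李中才, 徐俊艳, 吴昌友, 张漪.《生态学报》(Acta Ecologica Sinica), 2011, 31(18): 5396-5405
Ecological network analysis is an effective systems-analysis method for analyzing the interactions within an ecosystem and identifying the system's intrinsic, holistic properties. This review summarizes the main research results of ecological network analysis, including network structural properties, network stability, network ascendency and network efficiency; it introduces the process of building an ecological network model and the rules of community assembly; and, taking nitrogen cycling in the estuary at Neuss (诺伊斯), a city in western Germany, as an example, it describes how David K applied ecological network analysis to reveal the cycling patterns of micro-dynamic flows in the network. The main contributions of ecological network analysis are: (1) it describes and proves, with rigorous mathematical models and derivations, the relationships among ecosystem compartments that people had previously perceived only from experience; (2) it provides a method for studying the cycling of micro-dynamic flows in ecosystems and scientifically demonstrates the indirect cycling effects of material flows in ecosystems; (3) it not only provides a scientific mathematical method for analyzing ecosystems, but also offers a new epistemology for exploring ecosystems that differs from the Newtonian worldview. Summarizing and reviewing ecological network analysis benefits the application and further refinement of the method.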

15.

Background

Living systems are associated with social networks: networks made up of nodes, some of which may be more important in various respects than others. While different quantitative measures labeled as “centralities” have previously been used in the network analysis community to find influential nodes in a network, it is debatable how valid the centrality measures actually are. In other words, the research question that remains unanswered is: how exactly do these measures perform in the real world? For example, if the centrality of a particular node identifies it as important, is the node actually important?

Purpose

The goal of this paper is not just to perform a traditional social network analysis but rather to evaluate different centrality measures by conducting an empirical study analyzing exactly how network centralities correlate with data from published multidisciplinary network data sets.

Method

We take standard published network data sets while using a random network to establish a baseline. These data sets included Zachary's Karate Club network, a dolphin social network and a neural network of the nematode Caenorhabditis elegans. Each of the data sets was analyzed in terms of different centrality measures and compared with existing knowledge from associated published articles to review the role of each centrality measure in the determination of influential nodes.

Results

Our empirical analysis demonstrates that in the chosen network data sets, nodes which had a high Closeness Centrality also had a high Eccentricity Centrality. Likewise, high Degree Centrality correlated closely with a high Eigenvector Centrality. Betweenness Centrality, by contrast, varied according to network topology and did not demonstrate any noticeable pattern. In terms of identification of key nodes, we discovered that, compared with other centrality measures, Eigenvector and Eccentricity Centralities were better able to identify important nodes.
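To make some of the centrality measures named above concrete, here is a minimal sketch using only the Python standard library. The toy graph, the function names and the normalisations are assumptions for illustration; the abstract does not state which definitions or software the authors used, and the code returns raw eccentricity rather than any particular "Eccentricity Centrality" scaling.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def degree_centrality(adj):
    """Fraction of the other n - 1 nodes each node is directly linked to."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}

def closeness_centrality(adj):
    """(n - 1) divided by the sum of distances; assumes a connected graph."""
    n = len(adj)
    return {v: (n - 1) / sum(bfs_distances(adj, v).values()) for v in adj}

def eccentricity(adj):
    """Largest shortest-path distance from each node; assumes connectivity."""
    return {v: max(bfs_distances(adj, v).values()) for v in adj}

# Hypothetical undirected toy network: hub "a" with a short tail d - e.
toy = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a"},
    "d": {"a", "e"},
    "e": {"d"},
}

degree = degree_centrality(toy)
closeness = closeness_centrality(toy)
ecc = eccentricity(toy)
```

In this toy graph the hub "a" has the highest degree and closeness and the smallest eccentricity, which is the kind of agreement between measures that the study checks on real data sets.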

16.
Resolution of kinetic equations and parameter identification are discussed for n-compartment linear catenary models with elimination allowed from any compartment. For a given input, general formulas are derived to describe the tracer amount in any compartment as a function of the model parameters. Conversely, explicit procedures are given to identify the model parameters when the concentration-time curve is known in one arbitrary compartment, the tracer being injected into the same compartment. In this inverse problem, the solution is not unique: the model transfer rate constants can only be localized in a finite set of intervals.
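For orientation, the kinetic equations of a linear catenary model (a chain in which each compartment exchanges only with its immediate neighbours) can be sketched as follows. The notation is assumed for illustration and is not taken from the paper: x_i is the tracer amount in compartment i, a_i and b_i are the transfer rate constants from i to i+1 and from i to i-1, e_i is the elimination rate constant, and u_i(t) is the input.

```latex
% Mass balance for compartment i = 1, ..., n of a catenary chain
\frac{dx_i}{dt} = a_{i-1}\,x_{i-1} + b_{i+1}\,x_{i+1}
                  - \left(a_i + b_i + e_i\right) x_i + u_i(t),
\qquad a_0 = a_n = 0,\quad b_1 = b_{n+1} = 0

% Equivalent matrix form with a tridiagonal rate matrix A
\frac{d\mathbf{x}}{dt} = A\,\mathbf{x} + \mathbf{u}(t)
```

The tridiagonal structure of A is what distinguishes the catenary case from a general compartmental model.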

17.
The new procedure presented for constructing a Wagner network differs from Farris's (1970) method in that the amount of computation required is reduced. The usefulness of this procedure was examined by applying it to the 20 characters considered in a recent monograph of the seven OTUs of the genus Pentachaeta. A single network was derived from some 945 or more networks possible for this group. A comparison of the network constructed by this simplified method to that constructed by Farris's procedure revealed no differences. An attempt to reconstruct the cladistic history of this group by generating a Wagner tree based on the network resulted in four equally possible trees, suggesting that further data are needed before cladogenesis in this group is resolved.

18.
Systems research spanning fields from biology to finance involves the identification of models to represent the underpinnings of complex systems. Formal approaches for data-driven identification of network interactions include statistical inference-based approaches and methods to identify dynamical systems models that are capable of fitting multivariate data. Availability of large data sets and so-called ‘big data’ applications in biology present great opportunities as well as major challenges for systems identification/reverse engineering applications. For example, both inverse identification and forward simulations of genome-scale gene regulatory network models pose compute-intensive problems. This issue is addressed here by combining the processing power of Graphics Processing Units (GPUs) and a parallel reverse engineering algorithm for inference of regulatory networks. It is shown that, given an appropriate data set, information on genome-scale networks (systems of 1000 or more state variables) can be inferred using a reverse-engineering algorithm in a matter of days on a small-scale modern GPU cluster.

19.
The application of joint contact mechanics requires a precise configuration of the joint surfaces. B-splines and NURBS have been widely used to model joint surfaces, but because these formulations use a structured data set provided by a rectangular net first, then a grid, there is a limit to the accuracy of the models they can produce. However, new imaging systems such as 3D laser scanners can provide more realistic unstructured data sets. What is needed is a method to manipulate the unstructured data. We created a parametric polynomial function and applied it to unstructured data sets obtained by scanning joint surfaces. We applied our polynomial model to unstructured data sets from an artificial joint, and confirmed that our polynomial produced a smoother and more accurate model than the conventional B-spline method. Next, we applied it to a diarthrodial joint surface containing many ripples, and found that our function's noise filtering characteristics smoothed out existing ripples. Since no formulation was found to be optimal for all applications, we used two formulations to model surfaces with ripples. First, we used our polynomial to describe the global shape of the objective surface. Minute undulations were then specifically approximated with a Fourier series function. Finally, both approximated surfaces were superimposed to reproduce the original surface in a complete fashion.

20.
High-throughput molecular analysis has become an integral part of organismal systems biology. In contrast, due to a missing systematic linkage of the data with functional and predictive theoretical models of the underlying metabolic network, the understanding of the resulting complex data sets is lagging far behind. Here, we present a biomathematical method addressing this problem by using metabolomics data for the inverse calculation of a biochemical Jacobian matrix, thereby linking computer-based genome-scale metabolic reconstruction and in vivo metabolic dynamics. The incongruity of metabolome coverage by typical metabolite profiling approaches and genome-scale metabolic reconstruction was solved by the design of superpathways to define a metabolic interaction matrix. A differential biochemical Jacobian was calculated using an approach which links this metabolic interaction matrix and the covariance of metabolomics data satisfying a Lyapunov equation. The predictions of the differential Jacobian from real metabolomic data were found to be correct by testing the corresponding enzymatic activities. Moreover, it is demonstrated that the predictions of the biochemical Jacobian matrix allow for the design of parameter optimization strategies for ODE-based kinetic models of the system. The presented concept combines dynamic modelling strategies with large-scale steady state profiling approaches without the explicit knowledge of individual kinetic parameters. In summary, the presented strategy allows for the identification of regulatory key processes in the biochemical network directly from metabolomics data and is a fundamental achievement for the functional interpretation of metabolomics data.
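The abstract does not write out the Lyapunov equation it refers to. In the literature on metabolite covariance around a steady state, the relation is commonly written in the form below, given here only as an orienting sketch; the symbols J, C and D are assumed notation and may differ from the paper's.

```latex
% J : Jacobian of the metabolic network at steady state
% C : covariance matrix of the measured metabolite concentrations
% D : fluctuation (diffusion) matrix describing the noise acting on the system
J\,C + C\,J^{\mathsf{T}} = -2D
```

Given a measured C and a structural constraint on which entries of J may be nonzero (for example, a metabolic interaction matrix), the equation can in principle be solved inversely for J.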
