首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 46 毫秒
1.

Abstract

The extended Kalman filter (EKF) has been applied to inferring gene regulatory networks. However, it is well known that the EKF becomes less accurate when the system exhibits high nonlinearity. In addition, certain prior information about the gene regulatory network exists in practice, and no systematic approach has been developed to incorporate such prior information into the Kalman-type filter for inferring the structure of the gene regulatory network. In this paper, an inference framework based on point-based Gaussian approximation filters that can exploit the prior information is developed to solve the gene regulatory network inference problem. Different point-based Gaussian approximation filters, including the unscented Kalman filter (UKF), the third-degree cubature Kalman filter (CKF3), and the fifth-degree cubature Kalman filter (CKF5) are employed. Several types of network prior information, including the existing network structure information, sparsity assumption, and the range constraint of parameters, are considered, and the corresponding filters incorporating the prior information are developed. Experiments on a synthetic network of eight genes and the yeast protein synthesis network of five genes are carried out to demonstrate the performance of the proposed framework. The results show that the proposed methods provide more accurate inference results than existing methods, such as the EKF and the traditional UKF.
  相似文献   

2.
In this paper, the extended Kalman filter (EKF) algorithm is applied to model the gene regulatory network from gene time series data. The gene regulatory network is considered as a nonlinear dynamic stochastic model that consists of the gene measurement equation and the gene regulation equation. After specifying the model structure, we apply the EKF algorithm for identifying both the model parameters and the actual value of gene expression levels. It is shown that the EKF algorithm is an online estimation algorithm that can identify a large number of parameters (including parameters of nonlinear functions) through iterative procedure by using a small number of observations. Four real-world gene expression data sets are employed to demonstrate the effectiveness of the EKF algorithm, and the obtained models are evaluated from the viewpoint of bioinformatics.  相似文献   

3.
Data assimilation is a valuable tool in the study of any complex system, where measurements are incomplete, uncertain, or both. It enables the user to take advantage of all available information including experimental measurements and short-term model forecasts of a system. Although data assimilation has been used to study other biological systems, the study of the sleep-wake regulatory network has yet to benefit from this toolset. We present a data assimilation framework based on the unscented Kalman filter (UKF) for combining sparse measurements together with a relatively high-dimensional nonlinear computational model to estimate the state of a model of the sleep-wake regulatory system. We demonstrate with simulation studies that a few noisy variables can be used to accurately reconstruct the remaining hidden variables. We introduce a metric for ranking relative partial observability of computational models, within the UKF framework, that allows us to choose the optimal variables for measurement and also provides a methodology for optimizing framework parameters such as UKF covariance inflation. In addition, we demonstrate a parameter estimation method that allows us to track non-stationary model parameters and accommodate slow dynamics not included in the UKF filter model. Finally, we show that we can even use observed discretized sleep-state, which is not one of the model variables, to reconstruct model state and estimate unknown parameters. Sleep is implicated in many neurological disorders from epilepsy to schizophrenia, but simultaneous observation of the many brain components that regulate this behavior is difficult. We anticipate that this data assimilation framework will enable better understanding of the detailed interactions governing sleep and wake behavior and provide for better, more targeted, therapies.  相似文献   

4.

Background  

A reverse engineering of gene regulatory network with large number of genes and limited number of experimental data points is a computationally challenging task. In particular, reverse engineering using linear systems is an underdetermined and ill conditioned problem, i.e. the amount of microarray data is limited and the solution is very sensitive to noise in the data. Therefore, the reverse engineering of gene regulatory networks with large number of genes and limited number of data points requires rigorous optimization algorithm.  相似文献   

5.
The training of neural networks using the extended Kalman filter (EKF) algorithm is plagued by the drawback of high computational complexity and storage requirement that may become prohibitive even for networks of moderate size. In this paper, we present a local EKF training and pruning approach that can solve this problem. In particular, the by-products obtained along with the local EKF training can be utilized to measure the importance of the network weights. Comparing with the original global approach, the proposed local EKF training and pruning approach results in a much lower computational complexity and storage requirement. Hence, it is more practical in solving real world problems. The performance of the proposed algorithm is demonstrated on one medium- and one large-scale problems, namely, sunspot data prediction and handwritten digit recognition.  相似文献   

6.
Deng X  Geng H  Ali H 《Bio Systems》2005,81(2):125-136
Reverse-engineering of gene networks using linear models often results in an underdetermined system because of excessive unknown parameters. In addition, the practical utility of linear models has remained unclear. We address these problems by developing an improved method, EXpression Array MINing Engine (EXAMINE), to infer gene regulatory networks from time-series gene expression data sets. EXAMINE takes advantage of sparse graph theory to overcome the excessive-parameter problem with an adaptive-connectivity model and fitting algorithm. EXAMINE also guarantees that the most parsimonious network structure will be found with its incremental adaptive fitting process. Compared to previous linear models, where a fully connected model is used, EXAMINE reduces the number of parameters by O(N), thereby increasing the chance of recovering the underlying regulatory network. The fitting algorithm increments the connectivity during the fitting process until a satisfactory fit is obtained. We performed a systematic study to explore the data mining ability of linear models. A guideline for using linear models is provided: If the system is small (3-20 elements), more than 90% of the regulation pathways can be determined correctly. For a large-scale system, either clustering is needed or it is necessary to integrate information in addition to expression profile. Coupled with the clustering method, we applied EXAMINE to rat central nervous system development (CNS) data with 112 genes. We were able to efficiently generate regulatory networks with statistically significant pathways that have been predicted previously.  相似文献   

7.
Darvish A  Najarian K 《Bio Systems》2006,83(2-3):125-135
We propose a novel technique that constructs gene regulatory networks from DNA microarray data and gene-protein databases and then applies Mason rule to systematically search for the most dominant regulators of the network. The algorithm then recommends the identified dominant regulator genes as the best candidates for future knock-out experiments. Actively choosing the genes for knock-out experiments allows optimal perturbation of the pathway and therefore produces the most informative DNA microarray data for pathway identification purposes. This approach is more practically advantageous in analysis of large pathways where the time and cost of DNA microarray data experiments can be reduced using the proposed optimal experiment design. The proposed method was successfully tested on the galactose regulatory network.  相似文献   

8.
9.
The primary goal of this article is to infer genetic interactions based on gene expression data. A new method for multiorganism Bayesian gene network estimation is presented based on multitask learning. When the input datasets are sparse, as is the case in microarray gene expression data, it becomes difficult to separate random correlations from true correlations that would lead to actual edges when modeling the gene interactions as a Bayesian network. Multitask learning takes advantage of the similarity between related tasks, in order to construct a more accurate model of the underlying relationships represented by the Bayesian networks. The proposed method is tested on synthetic data to illustrate its validity. Then it is iteratively applied on real gene expression data to learn the genetic regulatory networks of two organisms with homologous genes.  相似文献   

10.
Coexpression of genes or, more generally, similarity in the expression profiles poses an unsurmountable obstacle to inferring the gene regulatory network (GRN) based solely on data from DNA microarray time series. Clustering of genes with similar expression profiles allows for a course-grained view of the GRN and a probabilistic determination of the connectivity among the clusters. We present a model for the temporal evolution of a gene cluster network which takes into account interactions of gene products with genes and, through a non-constant degradation rate, with other gene products. The number of model parameters is reduced by using polynomial functions to interpolate temporal data points. In this manner, the task of parameter estimation is reduced to a system of linear algebraic equations, thus making the computation time shorter by orders of magnitude. To eliminate irrelevant networks, we test each GRN for stability with respect to parameter variations, and impose restrictions on its behavior near the steady state. We apply our model and methods to DNA microarray time series' data collected on Escherichia coli during glucose-lactose diauxie and infer the most probable cluster network for different phases of the experiment. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s11693-011-9079-2) contains supplementary material, which is available to authorized users.  相似文献   

11.
We develop a new regression algorithm, cMIKANA, for inference of gene regulatory networks from combinations of steady-state and time-series gene expression data. Using simulated gene expression datasets to assess the accuracy of reconstructing gene regulatory networks, we show that steady-state and time-series data sets can successfully be combined to identify gene regulatory interactions using the new algorithm. Inferring gene networks from combined data sets was found to be advantageous when using noisy measurements collected with either lower sampling rates or a limited number of experimental replicates. We illustrate our method by applying it to a microarray gene expression dataset from human umbilical vein endothelial cells (HUVECs) which combines time series data from treatment with growth factor TNF and steady state data from siRNA knockdown treatments. Our results suggest that the combination of steady-state and time-series datasets may provide better prediction of RNA-to-RNA interactions, and may also reveal biological features that cannot be identified from dynamic or steady state information alone. Finally, we consider the experimental design of genomics experiments for gene regulatory network inference and show that network inference can be improved by incorporating steady-state measurements with time-series data.  相似文献   

12.
In the post-genomic biology era,the reconstruction of gene regulatory networks from microarray gene expression data isvery important to understand the underlying biological system,and it has been a challenging task in bioinformatics.TheBayesian network model has been used in reconstructing the gene regulatory network for its advantages,but how to determinethe network structure and parameters is still important to be explored.This paper proposes a two-stage structure learning algorithmwhich integrates immune evolution algorithm to build a Bayesian network.The new algorithm is evaluated with the use ofboth simulated and yeast cell cycle data.The experimental results indicate that the proposed algorithm can find many of theknown real regulatory relationships from literature and predict the others unknown with high validity and accuracy.  相似文献   

13.
We investigate in this paper reverse engineering of gene regulatory networks from time-series microarray data. We apply dynamic Bayesian networks (DBNs) for modeling cell cycle regulations. In developing a network inference algorithm, we focus on soft solutions that can provide a posteriori probability (APP) of network topology. In particular, we propose a variational Bayesian structural expectation maximization algorithm that can learn the posterior distribution of the network model parameters and topology jointly. We also show how the obtained APPs of the network topology can be used in a Bayesian data integration strategy to integrate two different microarray data sets. The proposed VBSEM algorithm has been tested on yeast cell cycle data sets. To evaluate the confidence of the inferred networks, we apply a moving block bootstrap method. The inferred network is validated by comparing it to the KEGG pathway map.  相似文献   

14.
Genetic regulatory network inference is critically important for revealing fundamental cellular processes, investigating gene functions, and understanding their relations. The availability of time series gene expression data makes it possible to investigate the gene activities of whole genomes, rather than those of only a pair of genes or among several genes. However, current computational methods do not sufficiently consider the temporal behavior of this type of data and lack the capability to capture the complex nonlinear system dynamics. We propose a recurrent neural network (RNN) and particle swarm optimization (PSO) approach to infer genetic regulatory networks from time series gene expression data. Under this framework, gene interaction is explained through a connection weight matrix. Based on the fact that the measured time points are limited and the assumption that the genetic networks are usually sparsely connected, we present a PSO-based search algorithm to unveil potential genetic network constructions that fit well with the time series data and explore possible gene interactions. Furthermore, PSO is used to train the RNN and determine the network parameters. Our approach has been applied to both synthetic and real data sets. The results demonstrate that the RNN/PSO can provide meaningful insights in understanding the nonlinear dynamics of the gene expression time series and revealing potential regulatory interactions between genes.  相似文献   

15.
In order to identify genes involved in complex diseases, it is crucial to study the genetic interactions at the systems biology level. By utilizing modern high throughput microarray technology, it has become feasible to obtain gene expressions data and turn it into knowledge that explains the regulatory behavior of genes. In this study, an unsupervised nonlinear model was proposed to infer gene regulatory networks on a genome-wide scale. The proposed model consists of two components, a robust correlation estimator and a nonlinear recurrent model. The robust correlation estimator was used to initialize the parameters of the nonlinear recurrent curve-fitting model. Then the initialized model was used to fit the microarray data. The model was used to simulate the underlying nonlinear regulatory mechanisms in biological organisms. The proposed algorithm was applied to infer the regulatory mechanisms of the general network in Saccharomyces cerevisiae and the pulmonary disease pathways in Homo sapiens. The proposed algorithm requires no prior biological knowledge to predict linkages between genes. The prediction results were checked against true positive links obtained from the YEASTRACT database, the TRANSFAC database, and the KEGG database. By checking the results with known interactions, we showed that the proposed algorithm could determine some meaningful pathways, many of which are supported by the existing literature.  相似文献   

16.
Improving the ability to reverse engineer biochemical networks is a major goal of systems biology. Lesions in signaling networks lead to alterations in gene expression, which in principle should allow network reconstruction. However, the information about the activity levels of signaling proteins conveyed in overall gene expression is limited by the complexity of gene expression dynamics and of regulatory network topology. Two observations provide the basis for overcoming this limitation: a. genes induced without de-novo protein synthesis (early genes) show a linear accumulation of product in the first hour after the change in the cell''s state; b. The signaling components in the network largely function in the linear range of their stimulus-response curves. Therefore, unlike most genes or most time points, expression profiles of early genes at an early time point provide direct biochemical assays that represent the activity levels of upstream signaling components. Such expression data provide the basis for an efficient algorithm (Plato''s Cave algorithm; PLACA) to reverse engineer functional signaling networks. Unlike conventional reverse engineering algorithms that use steady state values, PLACA uses stimulated early gene expression measurements associated with systematic perturbations of signaling components, without measuring the signaling components themselves. Besides the reverse engineered network, PLACA also identifies the genes detecting the functional interaction, thereby facilitating validation of the predicted functional network. Using simulated datasets, the algorithm is shown to be robust to experimental noise. Using experimental data obtained from gonadotropes, PLACA reverse engineered the interaction network of six perturbed signaling components. The network recapitulated many known interactions and identified novel functional interactions that were validated by further experiment. PLACA uses the results of experiments that are feasible for any signaling network to predict the functional topology of the network and to identify novel relationships.  相似文献   

17.
18.
Since measurements of process variables are subject to measurements errors as well as process variability, data reconciliation is the procedure of optimally adjusting measured date so that the adjusted values obey the conservation laws and constraints. Thus, data reconciliation for dynamic systems is fundamental and important for control, fault detection, and system optimization. Attempts to successfully implement estimators are often hindered by serve process nonlinearities, complicated state constraints, and un-measurable perturbations. As a constrained minimization problem, the dynamic data reconciliation is dynamically carried out to product smoothed estimates with variances from the original data. Many algorithms are proposed to solve such state estimation such as the extended Kalman filter (EKF), the unscented Kalman filter, and the cubature Kalman filter (CKF). In this paper, we investigate the use of CKF algorithm in comparative with the EKF to solve the nonlinear dynamic data reconciliation problem. First we give a broad overview of the recursive nonlinear data dynamic reconciliation (RNDDR) scheme, then present an extension to the CKF algorithm, and finally address the issue of how to solve the constraints in the CKF approach. The CCRNDDR method is proposed by applying the RNDDR in the CKF algorithm to handle nonlinearity and algebraic constraints and bounds. As the sampling idea is incorporated into the RNDDR framework, more accurate estimates can obtained via the recursive nature of the estimation procedure. The performance of the CKF approach is compared with EKF and RNDDR on nonlinear process systems with constraints. The conclusion is that with an error optimization solution of the correction step, the reformulated CKF shows high performance on the selection of nonlinear constrained process systems. Simulation results show the CCRNDDR is an efficient, accurate and stable method for real-time state estimation for nonlinear dynamic processes.  相似文献   

19.
It is system dynamics that determines the function of cells, tissues and organisms. To develop mathematical models and estimate their parameters are an essential issue for studying dynamic behaviors of biological systems which include metabolic networks, genetic regulatory networks and signal transduction pathways, under perturbation of external stimuli. In general, biological dynamic systems are partially observed. Therefore, a natural way to model dynamic biological systems is to employ nonlinear state-space equations. Although statistical methods for parameter estimation of linear models in biological dynamic systems have been developed intensively in the recent years, the estimation of both states and parameters of nonlinear dynamic systems remains a challenging task. In this report, we apply extended Kalman Filter (EKF) to the estimation of both states and parameters of nonlinear state-space models. To evaluate the performance of the EKF for parameter estimation, we apply the EKF to a simulation dataset and two real datasets: JAK-STAT signal transduction pathway and Ras/Raf/MEK/ERK signaling transduction pathways datasets. The preliminary results show that EKF can accurately estimate the parameters and predict states in nonlinear state-space equations for modeling dynamic biochemical networks.  相似文献   

20.
Gene regulatory networks are a crucial aspect of systems biology in describing molecular mechanisms of the cell. Various computational models rely on random gene selection to infer such networks from microarray data. While incorporation of prior knowledge into data analysis has been deemed important, in practice, it has generally been limited to referencing genes in probe sets and using curated knowledge bases. We investigate the impact of augmenting microarray data with semantic relations automatically extracted from the literature, with the view that relations encoding gene/protein interactions eliminate the need for random selection of components in non-exhaustive approaches, producing a more accurate model of cellular behavior. A genetic algorithm is then used to optimize the strength of interactions using microarray data and an artificial neural network fitness function. The result is a directed and weighted network providing the individual contribution of each gene to its target. For testing, we used invasive ductile carcinoma of the breast to query the literature and a microarray set containing gene expression changes in these cells over several time points. Our model demonstrates significantly better fitness than the state-of-the-art model, which relies on an initial random selection of genes. Comparison to the component pathways of the KEGG Pathways in Cancer map reveals that the resulting networks contain both known and novel relationships. The p53 pathway results were manually validated in the literature. 60% of non-KEGG relationships were supported (74% for highly weighted interactions). The method was then applied to yeast data and our model again outperformed the comparison model. Our results demonstrate the advantage of combining gene interactions extracted from the literature in the form of semantic relations with microarray analysis in generating contribution-weighted gene regulatory networks. This methodology can make a significant contribution to understanding the complex interactions involved in cellular behavior and molecular physiology.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号