Similar literature
1.
Chan ZS, Collins L, Kasabov N. Bio Systems, 2007, 87(2-3): 299-306
Differential equations (DEs) have been the most widespread formalism for gene regulatory network (GRN) modeling, as they offer natural interpretation of biological processes, easy elucidation of gene relationships, and the capability of using efficient parameter estimation methods. However, an important limitation of DEs is their requirement of O(d^2) parameters, where d is the number of genes modeled, which often causes over-parameterization for large d, leading to over-fitting of the data and dense parameter sets that are hard to interpret. This paper presents the first effort to address the over-parameterization problem by applying the sparse Bayesian learning (SBL) method to sparsify the GRN model of DEs. SBL operates on the parsimony principle, with the objective of reducing the number of effective parameters by driving the redundant parameters to zero. The resulting sparse parameter set offers three important advantages for GRN inference: first, the inferred GRNs are more plausible, since the biological counterparts are known to be sparse; second, gene relationships can be more easily elucidated from sparse sets than from dense sets; and third, the solutions become more optimal and consistent, due to the reduction in the volume of solution space. Experiments are conducted on the yeast Saccharomyces cerevisiae time-series gene expression data, in which known regulatory events related to the cell cycle G1/S phase are reliably reproduced.
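
To make the quadratic parameter count concrete, here is a minimal sketch that assumes a linear DE model. The abstract does not state the exact form of the equations, so the symbols $w_{ij}$ (influence of gene $j$ on gene $i$) and $b_i$ (a bias term) are illustrative assumptions.

```latex
\frac{dx_i}{dt} = \sum_{j=1}^{d} w_{ij}\, x_j + b_i, \qquad i = 1, \dots, d
```

Under this assumption the weight matrix alone contributes $d^2$ parameters, so a network of 100 genes would already have 10,000 weights to estimate. A sparse solution would set most $w_{ij}$ to zero, leaving only a few candidate regulators per gene.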

2.
Microarray gene expression data can provide insights into biological processes at a system-wide level and is commonly used for reverse engineering gene regulatory networks (GRN). Due to the amalgamation of noise from different sources, microarray expression profiles become inherently noisy, leading to a significant impact on the GRN reconstruction process. Microarray replicates (both biological and technical), generated to increase the reliability of data obtained under noisy conditions, have limited influence in enhancing the accuracy of reconstruction. Therefore, instead of the conventional GRN modeling approaches, which are deterministic, stochastic techniques are becoming increasingly necessary for inferring GRN from noisy microarray data. In this paper, we propose a new stochastic GRN model by investigating incorporation of various standard noise measurements in the deterministic S-system model. Experimental evaluations performed for varying sizes of synthetic network, representing different stochastic processes, demonstrate the effect of noise on the accuracy of genetic network modeling and the significance of stochastic modeling for GRN reconstruction. The proposed stochastic model is subsequently applied to infer the regulations among genes in two real-life networks: (1) the well-studied IRMA network, a real-life in vivo synthetic network constructed within the yeast Saccharomyces cerevisiae, and (2) the SOS DNA repair network in Escherichia coli.

3.
Reconstruction of genetic regulatory networks from time series data of gene expression patterns is an important research topic in bioinformatics. Probabilistic Boolean Networks (PBNs) have been proposed as an effective model for gene regulatory networks. PBNs are able to cope with uncertainty, incorporate rule-based dependencies between genes and discover the sensitivity of genes in their interactions with other genes. However, PBNs are unlikely to be used directly in practice because of the huge computational cost of obtaining predictors and their corresponding probabilities. In this paper, we propose a multivariate Markov model for approximating PBNs and describing the dynamics of a genetic network for gene expression sequences. The main contribution of the new model is to preserve the strength of PBNs and reduce the complexity of the networks. The number of parameters of our proposed model is O(n^2), where n is the number of genes involved. We also develop efficient estimation methods for solving the model parameters. Numerical examples on synthetic data sets and practical yeast data sequences are given to demonstrate the effectiveness of the proposed model.

4.
5.
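Objective: Gene regulatory networks are of great biological significance for drug development and for disease prevention and treatment. Existing methods for constructing networks from microarray data are generally inefficient and of low accuracy, so a new, efficient algorithm for predicting regulatory network structure is proposed. Methods: A genetic algorithm based on the greedy equivalence search (GES) mechanism is proposed for building gene regulatory network models. The multi-point parallelism of the genetic algorithm makes it easier for the algorithm to escape local optima. By encoding the network structure as the chromosome of the genetic algorithm and designing a mutation operator based on the GES mechanism, the network evolves in the space of Markov equivalence classes rather than in the space of directed acyclic graphs. Results: Predictions on the benchmark ASIA network and a yeast regulatory network were compared with the Order K2 algorithm recently proposed by Xue-wen Chen et al.; the proposed algorithm achieved better network construction accuracy. Compared with a standard genetic algorithm, execution efficiency was greatly improved. Conclusion: The proposed algorithm predicts network structure more accurately than the recently proposed Order K2 algorithm, and its evolutionary process is more efficient than that of a standard genetic algorithm.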

6.
Gene regulatory networks (GRNs) are complex biological systems that have a large impact on protein levels, so that discovering network interactions is a major objective of systems biology. Quantitative GRN models have been inferred, to date, from time series measurements of gene expression, but at small scale, and with limited application to real data. Time series experiments are typically short (number of time points of the order of ten), whereas regulatory networks can be very large (containing hundreds of genes). This creates an under-determination problem, which negatively influences the results of any inferential algorithm. Presented here is an integrative approach to model inference, which has not been previously discussed to the authors' knowledge. Multiple heterogeneous expression time series are used to infer the same model, and results are shown to be more robust to noise and parameter perturbation. Additionally, a wavelet analysis shows that these models display limited noise over-fitting within the individual datasets.

7.
We investigate in this paper reverse engineering of gene regulatory networks from time-series microarray data. We apply dynamic Bayesian networks (DBNs) for modeling cell cycle regulations. In developing a network inference algorithm, we focus on soft solutions that can provide a posteriori probability (APP) of network topology. In particular, we propose a variational Bayesian structural expectation maximization algorithm that can learn the posterior distribution of the network model parameters and topology jointly. We also show how the obtained APPs of the network topology can be used in a Bayesian data integration strategy to integrate two different microarray data sets. The proposed VBSEM algorithm has been tested on yeast cell cycle data sets. To evaluate the confidence of the inferred networks, we apply a moving block bootstrap method. The inferred network is validated by comparing it to the KEGG pathway map.
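
The moving block bootstrap mentioned in this abstract can be sketched as follows. This is a minimal, generic illustration of the resampling step only, not the authors' implementation; the function name `moving_block_bootstrap` and its parameters are assumptions made for this example.

```python
import random


def moving_block_bootstrap(series, block_len, rng=None):
    """Resample a time series by concatenating randomly chosen overlapping blocks.

    Sampling whole blocks (rather than single points) preserves the
    short-range temporal dependence inside each block.
    """
    rng = rng or random.Random()
    n = len(series)
    if not 1 <= block_len <= n:
        raise ValueError("block_len must be between 1 and len(series)")
    # Overlapping blocks can start at positions 0 .. n - block_len.
    starts = range(n - block_len + 1)
    resampled = []
    while len(resampled) < n:
        s = rng.choice(starts)
        resampled.extend(series[s:s + block_len])
    # Trim the final block so the replicate has the original length.
    return resampled[:n]


# Hypothetical usage: one bootstrap replicate of a 10-point series.
replicate = moving_block_bootstrap(list(range(10)), block_len=3, rng=random.Random(0))
```

In a confidence assessment of this kind, one would typically rerun network inference on many such replicates and report how often each edge reappears; that outer loop is omitted here.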

8.
9.
One of the pressing open problems of computational systems biology is the elucidation of the topology of genetic regulatory networks (GRNs) using high throughput genomic data, in particular microarray gene expression data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) challenge aims to evaluate the success of GRN inference algorithms on benchmarks of simulated data. In this article, we present GENIE3, a new algorithm for the inference of GRNs that was best performer in the DREAM4 In Silico Multifactorial challenge. GENIE3 decomposes the prediction of a regulatory network between p genes into p different regression problems. In each of the regression problems, the expression pattern of one of the genes (target gene) is predicted from the expression patterns of all the other genes (input genes), using the tree-based ensemble methods Random Forests or Extra-Trees. The importance of an input gene in the prediction of the target gene expression pattern is taken as an indication of a putative regulatory link. Putative regulatory links are then aggregated over all genes to provide a ranking of interactions from which the whole network is reconstructed. In addition to performing well on the DREAM4 In Silico Multifactorial challenge simulated data, we show that GENIE3 compares favorably with existing algorithms to decipher the genetic regulatory network of Escherichia coli. It doesn't make any assumption about the nature of gene regulation, can deal with combinatorial and non-linear interactions, produces directed GRNs, and is fast and scalable. In conclusion, we propose a new algorithm for GRN inference that performs well on both synthetic and real gene expression data. The algorithm, based on feature selection with tree-based ensemble methods, is simple and generic, making it adaptable to other types of genomic data and interactions.

10.
MOTIVATION: Bayesian networks have been applied to infer genetic regulatory interactions from microarray gene expression data. This inference problem is particularly hard in that interactions between hundreds of genes have to be learned from very small data sets, typically containing only a few dozen time points during a cell cycle. Most previous studies have assessed the inference results on real gene expression data by comparing predicted genetic regulatory interactions with those known from the biological literature. This approach is controversial due to the absence of known gold standards, which renders the estimation of the sensitivity and specificity, that is, the true and (complementary) false detection rate, unreliable and difficult. The objective of the present study is to test the viability of the Bayesian network paradigm in a realistic simulation study. First, gene expression data are simulated from a realistic biological network involving DNAs, mRNAs, inactive protein monomers and active protein dimers. Then, interaction networks are inferred from these data in a reverse engineering approach, using Bayesian networks and Bayesian learning with Markov chain Monte Carlo. RESULTS: The simulation results are presented as receiver operator characteristics curves. This allows estimating the proportion of spurious gene interactions incurred for a specified target proportion of recovered true interactions. The findings demonstrate how the network inference performance varies with the training set size, the degree of inadequacy of prior assumptions, the experimental sampling strategy and the inclusion of further, sequence-based information. AVAILABILITY: The programs and data used in the present study are available from http://www.bioss.sari.ac.uk/~dirk/Supplements
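
The trade-off between recovered true interactions and spurious ones can be traced out as in the sketch below. It is a generic illustration under stated assumptions: each candidate edge carries a confidence score, the true network is known (as in a simulation study), and ties between scores are broken arbitrarily. The function name `roc_points` and the toy edges are hypothetical, not taken from the study.

```python
def roc_points(scores, true_edges):
    """Return (false positive rate, true positive rate) pairs for ranked edges.

    scores: dict mapping a candidate edge (regulator, target) to a confidence score.
    true_edges: set of edges present in the known network.
    """
    positives = sum(1 for edge in scores if edge in true_edges)
    negatives = len(scores) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one true and one false candidate edge")
    points = [(0.0, 0.0)]
    tp = fp = 0
    # Lower the acceptance threshold one edge at a time, highest score first.
    for edge in sorted(scores, key=scores.get, reverse=True):
        if edge in true_edges:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical toy example with four candidate edges, two of them true.
scores = {("A", "B"): 0.9, ("B", "C"): 0.8, ("A", "C"): 0.3, ("C", "A"): 0.1}
true_edges = {("A", "B"), ("A", "C")}
curve = roc_points(scores, true_edges)
```

Reading the curve at a chosen true positive rate gives the fraction of false edges accepted at that threshold, which is the quantity the abstract describes estimating.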

11.
12.
MOTIVATION: Biological processes in cells are properly performed by gene regulations, signal transductions and interactions between proteins. To understand such molecular networks, we propose a statistical method to estimate gene regulatory networks and protein-protein interaction networks simultaneously from DNA microarray data, protein-protein interaction data and other genome-wide data. RESULTS: We unify Bayesian networks and Markov networks for estimating gene regulatory networks and protein-protein interaction networks according to the reliability of each biological information source. Through the simultaneous construction of gene regulatory networks and protein-protein interaction networks of Saccharomyces cerevisiae cell cycle, we predict the role of several genes whose functions are currently unknown. By using our probabilistic model, we can detect false positives of high-throughput data, such as yeast two-hybrid data. In a genome-wide experiment, we find possible gene regulatory relationships and protein-protein interactions between large protein complexes that underlie complex regulatory mechanisms of biological processes.

13.
A gene regulatory network (GRN) represents a set of genes and their regulatory interactions. The inference of the regulatory interactions between genes is usually carried out using an appropriate mathematical model and the available gene expression profile. Among the various models proposed for GRN inference, our recently proposed Michaelis–Menten based ODE model provides a good trade-off between the computational complexity and biological relevance. This model, like other known GRN models, also uses an evolutionary algorithm for parameter estimation. Considering various issues associated with such population-based stochastic optimization approaches (e.g. diversity, premature convergence due to local optima, accuracy, etc.), it becomes important to seed the initial population with good individuals which are closer to the optimal solution. In this paper, we exploit the inherent strength of principal component analysis (PCA) in a novel manner to initialize the population for GRN optimization. The benefit of the proposed method is validated by reconstructing in silico and in vivo networks of various sizes. For the same level of accuracy, the approach with PCA-based initialization shows improved convergence speed.

14.
The axial bodyplan of Drosophila melanogaster is determined during a process called morphogenesis. Shortly after fertilization, maternal bicoid mRNA is translated into Bicoid (Bcd). This protein establishes a spatially graded morphogen distribution along the anterior-posterior (AP) axis of the embryo. Bcd initiates AP axis determination by triggering expression of gap genes that subsequently regulate each other's expression to form a precisely controlled spatial distribution of gene products. Reaction-diffusion models of gap gene expression on a 1D domain have previously been used to infer complex genetic regulatory network (GRN) interactions by optimizing model parameters with respect to 1D gap gene expression data. Here we construct a finite element reaction-diffusion model with a realistic 3D geometry fit to full 3D gap gene expression data. Though gap gene products exhibit dorsal-ventral asymmetries, we discover that previously inferred gap GRNs yield qualitatively correct AP distributions on the 3D domain only when DV-symmetric initial conditions are employed. Model patterning loses qualitative agreement with experimental data when we incorporate a realistic DV-asymmetric distribution of Bcd. Further, we find that geometry alone is insufficient to account for DV-asymmetries in the final gap gene distribution. Additional GRN optimization confirms that the 3D model remains sensitive to GRN parameter perturbations. Finally, we find that incorporation of 3D data in simulation and optimization does not constrain the search space or improve optimization results.

15.
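Research progress on gene regulatory networks in hepatocellular carcinoma   Cited by: 1 in total (self-citations: 0, citations by others: 1)
Liu Xiangqiong (刘湘琼), Lian Baofeng (连保峰), Lin Yong (林勇). 《生物工程学报》 (Chinese Journal of Biotechnology), 2016, 32(10): 1322-1331
Hepatocellular carcinoma (HCC) is one of the most common malignant tumors in China. The HCC gene regulatory network (HCC GRN) is an important means of studying the molecular mechanisms of HCC; its nodes are HCC-related molecules such as miRNAs and transcription factors (TFs), and its edges are the interactions between nodes. HCC GRNs built from different types of data differ in type and characteristics. A survey of recent HCC GRN studies shows that transcriptional regulatory networks built from TFs and miRNAs are better at revealing key HCC genes and at reflecting how those genes are perturbed within the network. Integrating genetic variation information with regulatory networks is becoming a trend in HCC GRN research, but such studies are almost nonexistent. This paper reviews the data sources, classification, and characteristics of HCC GRNs and recent research on each type of regulatory network, analyzes and discusses the current state of HCC GRN research in light of related work, and offers an outlook, to provide a reference for work in this field.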

16.
We present a memetic algorithm for evolving the structure of biomolecular interactions and inferring the effective kinetic parameters from the time series data of gene expression using the decoupled S-system formalism. We propose an Information Criteria based fitness evaluation for gene network model selection instead of the conventional Mean Squared Error (MSE) based fitness evaluation. A hill-climbing local-search method has been incorporated in our evolutionary algorithm for efficiently attaining the skeletal architecture which is most frequently observed in biological networks. The suitability of the method is tested in gene circuit reconstruction experiments, varying the network dimension and/or characteristics, the amount of gene expression data used for inference and the noise level present in expression profiles. The reconstruction method inferred the network topology and the regulatory parameters with high accuracy. Nevertheless, the performance is limited by the amount of expression data used and the noise level present in the data. The proposed fitness function has been found more suitable for identifying correct network topology and for estimating accurate parameter values than the existing ones. Finally, we applied the methodology to analyze the cell-cycle gene expression data of budding yeast and reconstructed the network of some key regulators.
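
For readers unfamiliar with the S-system formalism named in this abstract, its standard form is shown below. The abstract itself does not print the equations, so this is the textbook definition rather than a quotation; $x_i$ is the expression level of gene $i$, $\alpha_i$ and $\beta_i$ are non-negative rate constants, and $g_{ij}$ and $h_{ij}$ are kinetic orders.

```latex
\frac{dx_i}{dt} = \alpha_i \prod_{j=1}^{n} x_j^{g_{ij}} - \beta_i \prod_{j=1}^{n} x_j^{h_{ij}}, \qquad i = 1, \dots, n
```

Each equation has $2(n+1)$ parameters, giving $2n(n+1)$ for the whole network. In the decoupled variant, each of the $n$ equations is fitted separately against the observed time series, which is why sparse ("skeletal") structures with most $g_{ij}$ and $h_{ij}$ equal to zero are attractive.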

17.
ABSTRACT: BACKGROUND: Understanding gene interactions is a fundamental question in systems biology. Currently, modeling of gene regulations using the Bayesian Network (BN) formalism assumes that genes interact either instantaneously or with a certain amount of time delay. However, in reality, biological regulations, both instantaneous and time-delayed, occur simultaneously. A framework that can detect and model both types of interactions simultaneously would represent gene regulatory networks more accurately. RESULTS: In this paper, we introduce a framework based on the Bayesian Network (BN) formalism that can represent both instantaneous and time-delayed interactions between genes simultaneously. A novel scoring metric having firm mathematical underpinnings is also proposed that, unlike other recent methods, can score both interactions concurrently and takes into account the reality that multiple regulators can regulate a gene jointly, rather than in an isolated pair-wise manner. Further, a gene regulatory network (GRN) inference method employing an evolutionary search that makes use of the framework and the scoring metric is also presented. CONCLUSION: By taking into consideration the biological fact that both instantaneous and time-delayed regulations can occur among genes, our approach models gene interactions with greater accuracy. The proposed framework is efficient and can be used to infer gene networks having multiple orders of instantaneous and time-delayed regulations simultaneously. Experiments are carried out using three different synthetic networks (with three different mechanisms for generating synthetic data) as well as real-life networks of Saccharomyces cerevisiae, E. coli and cyanobacteria gene expression data. The results show the effectiveness of our approach.

18.
A Boolean network is a graphical model for representing and analyzing the behavior of gene regulatory networks (GRN). In this context, the accurate and efficient reconstruction of a Boolean network is essential for understanding the gene regulation mechanism and the complex relations that exist therein. In this paper we introduce an elegant and efficient algorithm for the reverse engineering of Boolean networks from a time series of multivariate binary data corresponding to gene expression data. We call our method ReBMM, i.e., reverse engineering based on Bernoulli mixture models. The time complexity of most of the existing reverse engineering techniques is quite high and depends upon the indegree of a node in the network. Due to the high complexity of these methods, they can only be applied to sparsely connected networks of small sizes. ReBMM has a time complexity factor, which is independent of the indegree of a node and is quadratic in the number of nodes in the network, a big improvement over other techniques and yet there is little or no compromise in accuracy. We have tested ReBMM on a number of artificial datasets along with simulated data derived from a plant signaling network. We also used this method to reconstruct a network from real experimental observations of microarray data of the yeast cell cycle. Our method provides a natural framework for generating rules from a probabilistic model. It is simple, intuitive and illustrates excellent empirical results.
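
As a concrete picture of what a Boolean network is, the sketch below simulates a hypothetical three-gene network with synchronous updates. It shows only the forward direction (rules to binary time series); ReBMM addresses the reverse problem of recovering rules from such a series, which is not implemented here. The gene names and update rules are invented for illustration.

```python
def step(state, rules):
    """Synchronously update every gene from the current state."""
    return {gene: rule(state) for gene, rule in rules.items()}


def simulate(state, rules, n_steps):
    """Return the list of states visited, starting with the initial state."""
    trajectory = [dict(state)]
    for _ in range(n_steps):
        state = step(state, rules)
        trajectory.append(state)
    return trajectory


# Hypothetical network: C represses A, A activates B, A and B jointly activate C.
rules = {
    "A": lambda s: not s["C"],
    "B": lambda s: s["A"],
    "C": lambda s: s["A"] and s["B"],
}
trajectory = simulate({"A": True, "B": False, "C": False}, rules, 4)
```

Here gene C has indegree 2 because its rule reads two inputs. Many reconstruction methods enumerate candidate input sets of a given size, which is why their cost grows with indegree, the dependence the abstract says ReBMM avoids.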

19.
Coexpression of genes or, more generally, similarity in the expression profiles poses an insurmountable obstacle to inferring the gene regulatory network (GRN) based solely on data from DNA microarray time series. Clustering of genes with similar expression profiles allows for a coarse-grained view of the GRN and a probabilistic determination of the connectivity among the clusters. We present a model for the temporal evolution of a gene cluster network which takes into account interactions of gene products with genes and, through a non-constant degradation rate, with other gene products. The number of model parameters is reduced by using polynomial functions to interpolate temporal data points. In this manner, the task of parameter estimation is reduced to a system of linear algebraic equations, thus making the computation time shorter by orders of magnitude. To eliminate irrelevant networks, we test each GRN for stability with respect to parameter variations, and impose restrictions on its behavior near the steady state. We apply our model and methods to DNA microarray time series data collected on Escherichia coli during glucose-lactose diauxie and infer the most probable cluster network for different phases of the experiment. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s11693-011-9079-2) contains supplementary material, which is available to authorized users.

20.

Background

Gene expression time series data are usually in the form of high-dimensional arrays. Unfortunately, the data may sometimes contain missing values: either the expression values of some genes at some time points, or all expression values at a single time point or at some sets of consecutive time points. This significantly affects the performance of many algorithms for gene expression analysis that take the complete matrix of gene expression measurements as input. For instance, previous works have shown that gene regulatory interactions can be estimated from the complete matrix of gene expression measurements. Yet, to date, few algorithms have been proposed for the inference of gene regulatory networks from gene expression data with missing values.

Results

We describe a nonlinear dynamic stochastic model for the evolution of gene expression. The model captures the structural, dynamical, and the nonlinear natures of the underlying biomolecular systems. We present point-based Gaussian approximation (PBGA) filters for joint state and parameter estimation of the system with one-step or two-step missing measurements. The PBGA filters use Gaussian approximation and various quadrature rules, such as the unscented transform (UT), the third-degree cubature rule and the central difference rule for computing the related posteriors. The proposed algorithm is evaluated with satisfying results for synthetic networks, in silico networks released as a part of the DREAM project, and the real biological network, the in vivo reverse engineering and modeling assessment (IRMA) network of yeast Saccharomyces cerevisiae.

Conclusion

PBGA filters are proposed to elucidate the underlying gene regulatory network (GRN) from time series gene expression data that contain missing values. In our state-space model, we proposed a measurement model that incorporates the effect of the missing data points into the sequential algorithm. This approach produces a better inference of the model parameters and hence, more accurate prediction of the underlying GRN compared to when using the conventional Gaussian approximation (GA) filters ignoring the missing data points.
