Similar literature
Found 20 similar articles (search time: 31 ms)
1.
2.
Biological systems are traditionally studied by focusing on a specific subsystem, building an intuitive model for it, and refining the model using results from carefully designed experiments. Modern experimental techniques provide massive data on the global behavior of biological systems, and systematically using these large datasets for refining existing knowledge is a major challenge. Here we introduce an extended computational framework that combines formalization of existing qualitative models, probabilistic modeling, and integration of high-throughput experimental data. Using our methods, it is possible to interpret genomewide measurements in the context of prior knowledge on the system, to assign statistical meaning to the accuracy of such knowledge, and to learn refined models with improved fit to the experiments. Our model is represented as a probabilistic factor graph, and the framework accommodates partial measurements of diverse biological elements. We study the performance of several probabilistic inference algorithms and show that hidden model variables can be reliably inferred even in the presence of feedback loops and complex logic. We show how to refine prior knowledge on combinatorial regulatory relations using hypothesis testing and derive p-values for learned model features. We test our methodology and algorithms on a simulated model and on two real yeast models. In particular, we use our method to explore uncharacterized relations among regulators in the yeast response to hyper-osmotic shock and in the yeast lysine biosynthesis system. Our integrative approach to the analysis of biological regulation is demonstrated to synergistically combine qualitative and quantitative evidence into concrete biological predictions.  相似文献   

3.
4.
Computational modeling is being used increasingly in neuroscience. In deriving such models, inference issues such as model selection, model complexity, and model comparison must be addressed constantly. In this article we present briefly the Bayesian approach to inference. Under a simple set of commonsense axioms, there exists essentially a unique way of reasoning under uncertainty by assigning a degree of confidence to any hypothesis or model, given the available data and prior information. Such degrees of confidence must obey all the rules governing probabilities and can be updated accordingly as more data becomes available. While the Bayesian methodology can be applied to any type of model, as an example we outline its use for an important, and increasingly standard, class of models in computational neuroscience—compartmental models of single neurons. Inference issues are particularly relevant for these models: their parameter spaces are typically very large, neurophysiological and neuroanatomical data are still sparse, and probabilistic aspects are often ignored. As a tutorial, we demonstrate the Bayesian approach on a class of one-compartment models with varying numbers of conductances. We then apply Bayesian methods on a compartmental model of a real neuron to determine the optimal amount of noise to add to the model to give it a level of spike time variability comparable to that found in the real cell.  相似文献   
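
The updating rule this abstract describes can be sketched for model comparison as follows. This is a minimal illustration and not the authors' code: the function name `posterior_model_probs` and the example log-evidence values are assumptions made here, and in practice each log evidence would come from integrating a model's likelihood over its parameters.

```python
import math

def posterior_model_probs(log_evidences, priors=None):
    """Bayes' rule over a finite set of models: P(M | D) is proportional to P(D | M) * P(M)."""
    n = len(log_evidences)
    if priors is None:
        priors = [1.0 / n] * n  # assumption: uniform prior over the candidate models
    log_post = [le + math.log(p) for le, p in zip(log_evidences, priors)]
    shift = max(log_post)  # subtract the maximum for numerical stability
    weights = [math.exp(lp - shift) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical log evidences for three candidate models (illustrative numbers only).
probs = posterior_model_probs([-120.0, -118.0, -125.0])
```

Working in log space keeps the normalization stable when evidences differ by many orders of magnitude, which is common when comparing models with different numbers of parameters.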

5.
One of the most important goals of biological investigation is to uncover gene functional relations. In this study we propose a framework for extraction and integration of gene functional relations from diverse biological data sources, including gene expression data, biological literature and genomic sequence information. We introduce a two-layered Bayesian network approach to integrate relations from multiple sources into a genome-wide functional network. An experimental study was conducted on a test-bed of Arabidopsis thaliana. Evaluation of the integrated network demonstrated that relation integration could improve the reliability of relations by combining evidence from different data sources. Domain expert judgments on the gene functional clusters in the network confirmed the validity of our approach for relation integration and network inference.  相似文献   

6.
We introduce here the concept of Implicit networks which provide, like Bayesian networks, a graphical modelling framework that encodes the joint probability distribution for a set of random variables within a directed acyclic graph. We show that Implicit networks, when used in conjunction with appropriate statistical techniques, are very attractive for their ability to understand and analyze biological data. Particularly, we consider here the use of Implicit networks for causal inference in biomolecular pathways. In such pathways, an Implicit network encodes dependencies among variables (proteins, genes), can be trained to learn causal relationships (regulation, interaction) between them and then used to predict the biological response given the status of some key proteins or genes in the network. We show that Implicit networks offer efficient methodologies for learning from observations without prior knowledge and thus provide a good alternative to classical inference in Bayesian networks when priors are missing. We illustrate our approach by an application to simulated data for a simplified signal transduction pathway of the epidermal growth factor receptor (EGFR) protein.  相似文献   

7.
The problem of reconstructing large-scale gene regulatory networks from gene expression data has garnered considerable attention in bioinformatics over the past decade, with the graphical modeling paradigm having emerged as a popular framework for inference. Analysis in a full Bayesian setting is contingent upon the assignment of a so-called structure prior: a probability distribution on networks, encoding a priori biological knowledge either in the form of supplemental data or high-level topological features. A key topological consideration is that a wide range of cellular networks are approximately scale-free, meaning that the fraction, $P(k)$, of nodes in a network with degree $k$ is roughly described by a power law $P(k) \propto k^{-\gamma}$ with exponent $\gamma$ between 2 and 3. The standard practice, however, is to utilize a random structure prior, which favors networks with binomially distributed degree distributions. In this paper, we introduce a scale-free structure prior for graphical models based on the formula for the probability of a network under a simple scale-free network model. Unlike the random structure prior, its scale-free counterpart requires a node labeling as a parameter. In order to use this prior for large-scale network inference, we design a novel Metropolis-Hastings sampler for graphical models that includes a node labeling as a state space variable. In a simulation study, we demonstrate that the scale-free structure prior outperforms the random structure prior at recovering scale-free networks while retaining the ability to recover random networks. We then estimate a gene association network from gene expression data taken from a breast cancer tumor study, showing that the scale-free structure prior recovers hubs, including the previously unknown hub SLC39A6, which is a zinc transporter that has been implicated in the spread of breast cancer to the lymph nodes.
Our analysis of the breast cancer expression data underscores the value of the scale-free structure prior as an instrument to aid in the identification of candidate hub genes with the potential to direct the hypotheses of molecular biologists, and thus drive future experiments.
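
The contrast between power-law and binomial degree distributions drawn in this abstract can be illustrated numerically. The sketch below is not the paper's structure prior or its sampler; it assumes the conventional power-law form in which the fraction of nodes with degree k is proportional to k to the power of minus gamma, and the values `GAMMA`, `N_NODES`, `P_EDGE`, and `HUB_DEGREE` are hypothetical choices made for illustration.

```python
import math

def power_law_pmf(gamma, k_max):
    """Normalized degree distribution with P(k) proportional to k**(-gamma), k = 1..k_max."""
    weights = [k ** (-gamma) for k in range(1, k_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def binomial_pmf(n, p, k):
    """Probability of degree k when each of n possible edges is present with probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

GAMMA = 2.5       # assumed exponent, illustrative only
N_NODES = 101     # each node can have up to 100 neighbours
P_EDGE = 0.03     # assumed edge probability for the random-network model
HUB_DEGREE = 20   # a degree large enough to count as a hub in this toy setting

scale_free = power_law_pmf(GAMMA, N_NODES - 1)
p_hub_scale_free = scale_free[HUB_DEGREE - 1]
p_hub_random = binomial_pmf(N_NODES - 1, P_EDGE, HUB_DEGREE)
```

Under these assumptions the power-law model assigns far more probability to a degree-20 node than the binomial model does, which is the intuition behind a prior that can recover hubs.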

8.
An important open problem of computational neuroscience is the generic organization of computations in networks of neurons in the brain. We show here through rigorous theoretical analysis that inherent stochastic features of spiking neurons, in combination with simple nonlinear computational operations in specific network motifs and dendritic arbors, enable networks of spiking neurons to carry out probabilistic inference through sampling in general graphical models. In particular, it enables them to carry out probabilistic inference in Bayesian networks with converging arrows ("explaining away") and with undirected loops, that occur in many real-world tasks. Ubiquitous stochastic features of networks of spiking neurons, such as trial-to-trial variability and spontaneous activity, are necessary ingredients of the underlying computational organization. We demonstrate through computer simulations that this approach can be scaled up to neural emulations of probabilistic inference in fairly large graphical models, yielding some of the most complex computations that have been carried out so far in networks of spiking neurons.  相似文献   

9.
ABSTRACT: BACKGROUND: Inference about regulatory networks from high-throughput genomics data is of great interest in systems biology. We present a Bayesian approach to infer gene regulatory networks from time series expression data by integrating various types of biological knowledge. RESULTS: We formulate network construction as a series of variable selection problems and use linear regression to model the data. Our method summarizes additional data sources with an informative prior probability distribution over candidate regression models. We extend the Bayesian model averaging (BMA) variable selection method to select regulators in the regression framework. We summarize the external biological knowledge by an informative prior probability distribution over the candidate regression models. CONCLUSIONS: We demonstrate our method on simulated data and a set of time-series microarray experiments measuring the effect of a drug perturbation on gene expression levels, and show that it outperforms leading regression-based methods in the literature.  相似文献   

10.
Hierarchical generative models, such as Bayesian networks, and belief propagation have been shown to provide a theoretical framework that can account for perceptual processes, including feedforward recognition and feedback modulation. The framework explains both psychophysical and physiological experimental data and maps well onto the hierarchical distributed cortical anatomy. However, the complexity required to model cortical processes makes inference, even using approximate methods, very computationally expensive. Thus, existing object perception models based on this approach are typically limited to tree-structured networks with no loops, use small toy examples or fail to account for certain perceptual aspects such as invariance to transformations or feedback reconstruction. In this study we develop a Bayesian network with an architecture similar to that of HMAX, a biologically-inspired hierarchical model of object recognition, and use loopy belief propagation to approximate the model operations (selectivity and invariance). Crucially, the resulting Bayesian network extends the functionality of HMAX by including top-down recursive feedback. Thus, the proposed model not only achieves successful feedforward recognition invariant to noise, occlusions, and changes in position and size, but is also able to reproduce modulatory effects such as illusory contour completion and attention. Our novel and rigorous methodology covers key aspects such as learning using a layerwise greedy algorithm, combining feedback information from multiple parents and reducing the number of operations required. Overall, this work extends an established model of object recognition to include high-level feedback modulation, based on state-of-the-art probabilistic approaches. 
The methodology employed, consistent with evidence from the visual cortex, can be potentially generalized to build models of hierarchical perceptual organization that include top-down and bottom-up interactions, for example, in other sensory modalities.  相似文献   

11.
Humans have been shown to combine noisy sensory information with previous experience (priors), in qualitative and sometimes quantitative agreement with the statistically-optimal predictions of Bayesian integration. However, when the prior distribution becomes more complex than a simple Gaussian, such as skewed or bimodal, training takes much longer and performance appears suboptimal. It is unclear whether such suboptimality arises from an imprecise internal representation of the complex prior, or from additional constraints in performing probabilistic computations on complex distributions, even when accurately represented. Here we probe the sources of suboptimality in probabilistic inference using a novel estimation task in which subjects are exposed to an explicitly provided distribution, thereby removing the need to remember the prior. Subjects had to estimate the location of a target given a noisy cue and a visual representation of the prior probability density over locations, which changed on each trial. Different classes of priors were examined (Gaussian, unimodal, bimodal). Subjects' performance was in qualitative agreement with the predictions of Bayesian Decision Theory although generally suboptimal. The degree of suboptimality was modulated by statistical features of the priors but was largely independent of the class of the prior and level of noise in the cue, suggesting that suboptimality in dealing with complex statistical features, such as bimodality, may be due to a problem of acquiring the priors rather than computing with them. We performed a factorial model comparison across a large set of Bayesian observer models to identify additional sources of noise and suboptimality. Our analysis rejects several models of stochastic behavior, including probability matching and sample-averaging strategies.
Instead we show that subjects' response variability was mainly driven by a combination of a noisy estimation of the parameters of the priors, and by variability in the decision process, which we represent as a noisy or stochastic posterior.

12.
《Biophysical journal》2023,122(2):433-441
Potential energy landscapes are useful models in describing events such as protein folding and binding. While single-molecule fluorescence resonance energy transfer (smFRET) experiments encode information on continuous potentials for the system probed, including rarely visited barriers between putative potential minima, this information is rarely decoded from the data. This is because existing analysis methods often model smFRET output assuming, from the outset, that the system probed evolves in a discretized state space to be analyzed within a hidden Markov model (HMM) paradigm. By contrast, here, we infer continuous potentials from smFRET data without discretely approximating the state space. We do so by operating within a Bayesian nonparametric paradigm by placing priors on the family of all possible potential curves. As our inference accounts for a number of required experimental features raising computational cost (such as incorporating discrete photon shot noise), the framework leverages a structured-kernel-interpolation Gaussian process prior to help curtail computational cost. We show that our structured-kernel-interpolation priors for potential energy reconstruction from smFRET analysis accurately infer the potential energy landscape from an smFRET binding experiment. We then illustrate advantages of structured-kernel-interpolation priors for potential energy reconstruction from smFRET over standard HMM approaches by providing information, such as barrier heights and friction coefficients, that is otherwise inaccessible to HMMs.

13.
Bayesian inference has emerged as a general framework that captures how organisms make decisions under uncertainty. Recent experimental findings reveal disparate mechanisms for how the brain generates behaviors predicted by normative Bayesian theories. Here, we identify two broad classes of neural implementations for Bayesian inference: a modular class, where each probabilistic component of Bayesian computation is independently encoded and a transform class, where uncertain measurements are converted to Bayesian estimates through latent processes. Many recent experimental neuroscience findings studying probabilistic inference broadly fall into these classes. We identify potential avenues for synthesis across these two classes and the disparities that, at present, cannot be reconciled. We conclude that to distinguish among implementation hypotheses for Bayesian inference, we require greater engagement among theoretical and experimental neuroscientists in an effort that spans different scales of analysis, circuits, tasks, and species.  相似文献   

14.
15.
16.
MOTIVATION: The task of engineering a protein to perform a target biological function is known as protein design. A commonly used paradigm casts this functional design problem as a structural one, assuming a fixed backbone. In probabilistic protein design, positional amino acid probabilities are used to create a random library of sequences to be simultaneously screened for biological activity. Clearly, certain choices of probability distributions will be more successful in yielding functional sequences. However, since the number of sequences is exponential in protein length, computational optimization of the distribution is difficult. RESULTS: In this paper, we develop a computational framework for probabilistic protein design following the structural paradigm. We formulate the distribution of sequences for a structure using the Boltzmann distribution over their free energies. The corresponding probabilistic graphical model is constructed, and we apply belief propagation (BP) to calculate marginal amino acid probabilities. We test this method on a large structural dataset and demonstrate the superiority of BP over previous methods. Nevertheless, since the results obtained by BP are far from optimal, we thoroughly assess the paradigm using high-quality experimental data. We demonstrate that, for small scale sub-problems, BP attains identical results to those produced by exact inference on the paradigmatic model. However, quantitative analysis shows that the distributions predicted significantly differ from the experimental data. These findings, along with the excellent performance we observed using BP on the smaller problems, suggest potential shortcomings of the paradigm. We conclude with a discussion of how it may be improved in the future.  相似文献   
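
The idea of positional amino acid probabilities under a Boltzmann distribution can be sketched for a toy problem. The code below uses exhaustive enumeration (exact inference), not belief propagation, and it is not the authors' implementation: the function `boltzmann_marginals`, the energy tables `single` and `pair`, the two-letter alphabet, and the temperature factor `kT` are all hypothetical and chosen only to keep the example small.

```python
import itertools
import math

def boltzmann_marginals(single, pair, kT=1.0):
    """Exact per-position marginals of P(seq) proportional to exp(-E(seq) / kT).

    single[i][a] is the energy of residue type a at position i;
    pair[(i, j)][a][b] is the interaction energy of a at i with b at j.
    """
    n_pos = len(single)
    n_aa = len(single[0])
    marginals = [[0.0] * n_aa for _ in range(n_pos)]
    partition = 0.0
    # Enumerate every sequence; feasible only for very small problems.
    for seq in itertools.product(range(n_aa), repeat=n_pos):
        energy = sum(single[i][a] for i, a in enumerate(seq))
        energy += sum(table[seq[i]][seq[j]] for (i, j), table in pair.items())
        weight = math.exp(-energy / kT)
        partition += weight
        for i, a in enumerate(seq):
            marginals[i][a] += weight
    return [[m / partition for m in row] for row in marginals]

# Hypothetical energies: three positions, two residue types, one coupled pair.
single = [[0.0, 1.0], [0.5, 0.0], [0.0, 0.0]]
pair = {(0, 1): [[0.0, 2.0], [2.0, 0.0]]}
marginals = boltzmann_marginals(single, pair)
```

Because the number of sequences grows exponentially with protein length, enumeration of this kind is only usable on small sub-problems, which is why approximate methods such as belief propagation are needed at realistic sizes.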

17.
Miller MA, Feng XJ, Li G, Rabitz HA. 《PloS one》2012,7(6):e37664
This work presents an adapted Random Sampling - High Dimensional Model Representation (RS-HDMR) algorithm for synergistically addressing three key problems in network biology: (1) identifying the structure of biological networks from multivariate data, (2) predicting network response under previously unsampled conditions, and (3) inferring experimental perturbations based on the observed network state. RS-HDMR is a multivariate regression method that decomposes network interactions into a hierarchy of non-linear component functions. Sensitivity analysis based on these functions provides a clear physical and statistical interpretation of the underlying network structure. The advantages of RS-HDMR include efficient extraction of nonlinear and cooperative network relationships without resorting to discretization, prediction of network behavior without mechanistic modeling, robustness to data noise, and favorable scalability of the sampling requirement with respect to network size. As a proof-of-principle study, RS-HDMR was applied to experimental data measuring the single-cell response of a protein-protein signaling network to various experimental perturbations. A comparison to network structure identified in the literature and through other inference methods, including Bayesian and mutual-information based algorithms, suggests that RS-HDMR can successfully reveal a network structure with a low false positive rate while still capturing non-linear and cooperative interactions. RS-HDMR identified several higher-order network interactions that correspond to known feedback regulations among multiple network species and that were unidentified by other network inference methods. Furthermore, RS-HDMR has a better ability to predict network response under unsampled conditions in this application than the best statistical inference algorithm presented in the recent DREAM3 signaling-prediction competition. 
RS-HDMR can discern and predict differences in network state that arise from sources ranging from intrinsic cell-cell variability to altered experimental conditions, such as when drug perturbations are introduced. This ability ultimately allows RS-HDMR to accurately classify the experimental conditions of a given sample based on its observed network state.  相似文献   

18.
The standard approach for identifying gene networks is based on experimental perturbations of gene regulatory systems, such as gene knock-out experiments, followed by genome-wide profiling of differential gene expression. However, this approach is significantly limited in that it is not possible to perturb more than one or two genes simultaneously to discover complex gene interactions or to distinguish between direct and indirect downstream regulation of the differentially expressed genes. As an alternative, genetical genomics studies have been proposed that treat naturally occurring genetic variants as potential perturbants of the gene regulatory system and recover gene networks via analysis of population gene-expression and genotype data. Despite the many advantages of genetical genomics data analysis, the computational challenge of decoding the effects of multifactorial genetic perturbations simultaneously from data has prevented its widespread application. In this article, we propose a statistical framework for learning gene networks that overcomes the limitations of experimental perturbation methods and addresses the challenges of genetical genomics analysis. We introduce a new statistical model, called a sparse conditional Gaussian graphical model, and describe an efficient learning algorithm that simultaneously decodes the perturbations of the gene regulatory system by a large number of SNPs to identify a gene network along with expression quantitative trait loci (eQTLs) that perturb this network. While our statistical model captures direct genetic perturbations of the gene network, by performing inference on the probabilistic graphical model, we obtain detailed characterizations of how the direct SNP perturbation effects propagate through the gene network to perturb other genes indirectly. We demonstrate our statistical method using HapMap-simulated and yeast eQTL datasets.
In particular, the yeast gene network identified computationally by our method under SNP perturbations is well supported by the results from experimental perturbation studies related to DNA replication stress response.

19.
Systems biology aims to develop mathematical models of biological systems by integrating experimental and theoretical techniques. During the last decade, many systems biology approaches that are based on genome-wide data have been developed to unravel the complexity of gene regulation. This review deals with the reconstruction of gene regulatory networks (GRNs) from experimental data through computational methods. Standard GRN inference methods primarily use gene expression data derived from microarrays. However, the incorporation of additional information from heterogeneous data sources, e.g. genome sequence and protein–DNA interaction data, clearly supports the network inference process. This review focuses on promising modelling approaches that use such diverse types of molecular biological information. In particular, approaches are discussed that enable the modelling of the dynamics of gene regulatory systems. The review provides an overview of common modelling schemes and learning algorithms and outlines current challenges in GRN modelling.

20.
This paper presents a new statistical technique, Bayesian Generalized Associative Functional Networks (GAFN), to model the dynamic plant growth process of greenhouse crops. GAFNs are able to incorporate domain knowledge and data to model complex ecosystems. By use of functional networks and the Bayesian framework, prior knowledge can be naturally embedded into the model, and the functional relationship between inputs and outputs can be learned during the training process. Our main interest is focused on the Generalized Associative Functional Networks (GAFNs), which are appropriate for modeling multiple-variable processes. Three main advantages are obtained by applying Bayesian GAFN methods to modeling the dynamic process of plant growth. Firstly, this approach provides a powerful tool for revealing useful relationships between the greenhouse environmental factors and the plant growth parameters. Secondly, Bayesian GAFN can model Multiple-Input Multiple-Output (MIMO) systems from the given data, and shows good generalization capability, with a single final model successfully fitting all 12 data sets from 5 years of field experiments. Thirdly, the Bayesian GAFN method can also serve as an optimization tool to estimate parameters of interest in the agro-ecosystem. In this work, two algorithms are proposed for the statistical inference of parameters in GAFNs. Both of them are based on variational inference, also called variational Bayes (VB) techniques, which may provide probabilistic interpretations for the built models. VB-based learning methods are able to yield estimates of the full posterior probability of model parameters. Synthetic and real-world examples are implemented to confirm the validity of the proposed methods.
