Similar articles
1.
MOTIVATION: Although many network inference algorithms have been presented in the bioinformatics literature, no suitable approach has been formulated for evaluating their effectiveness at recovering models of complex biological systems from limited data. To overcome this limitation, we propose an approach to evaluate network inference algorithms according to their ability to recover a complex functional network from biologically reasonable simulated data. RESULTS: We designed a simulator to generate data representing a complex biological system at multiple levels of organization: behaviour, neural anatomy, brain electrophysiology, and gene expression of songbirds. About 90% of the simulated variables are unregulated by other variables in the system and are included simply as distracters. We sampled the simulated data at intervals, as one would sample from a biological system in practice, and then used the sampled data to evaluate the effectiveness of an algorithm we developed for functional network inference. We found that our algorithm is highly effective at recovering the functional network structure of the simulated system (including the irrelevance of unregulated variables) from sampled data alone. To assess the reproducibility of these results, we tested our inference algorithm on 50 separately simulated sets of data, and it consistently recovered the complex functional network structure underlying the simulated data almost perfectly. To our knowledge, this is the first approach for evaluating the effectiveness of functional network inference algorithms at recovering models from limited data. Our simulation approach also enables researchers to design, a priori, experiments and data-collection protocols that are amenable to functional network inference.

2.
3.
Cho KH, Choo SM, Wellstead P, Wolkenhauer O. FEBS Letters 2005, 579(20): 4520-4528
We propose a unified framework for the identification of functional interaction structures of biomolecular networks in a way that leads to a new experimental design procedure. In developing our approach, we have built upon previous work. Thus we begin by pointing out some of the restrictions associated with existing structure identification methods and showing how these restrictions may be eased. In particular, existing methods use specific forms of experimental algebraic equations with which to identify the functional interaction structure of a biomolecular network. In our work, we employ an extended form of these experimental algebraic equations which, while retaining their merits, also overcomes some of their disadvantages. Experimental data are required in order to estimate the coefficients of the experimental algebraic equation set associated with the structure identification task. However, experimentalists are rarely provided with guidance on which parameters to perturb and to what extent to perturb them. When a model of network dynamics is required, there is also the vexed question of sample rate and sample time selection to be resolved. Supplying some answers to these questions is the main motivation of this paper. The approach is based on stationary and/or temporal data obtained from parameter perturbations, and unifies the previous approaches of Kholodenko et al. (PNAS 99 (2002) 12841-12846) and Sontag et al. (Bioinformatics 20 (2004) 1877-1886). By way of demonstration, we apply our unified approach to a network model which cannot be properly identified by existing methods. Finally, we propose an experiment design methodology, which is not limited by the number of parameter perturbations, and illustrate its use with an in numero example.

4.
Building a meaningful model of a biological regulatory network is usually done by specifying the components (e.g. the genes) and their interactions, by guessing the values of parameters, by comparing the predicted behaviors to the observed ones, and by modifying both architecture and parameters in a trial-and-error process in order to reach an optimal fitness. We propose here a different approach to constructing and analyzing biological models that avoids the trial-and-error part and in which structure and dynamics are represented as formal constraints. We apply the method to Hopfield-like networks, a formalism often used in both neural and regulatory network modeling. The aim is to characterize automatically the set of all models consistent with all the available knowledge (about structure and behavior). The available knowledge is formalized into formal constraints. The latter are compiled into a Boolean formula in conjunctive normal form and then submitted to a Boolean satisfiability solver. This approach makes it possible to formulate a wide range of queries, expressed in a high-level language and possibly integrating formalized intuitions. In order to explore its potential, we use it to find cycles for 3-node networks and to determine the flower morphogenesis regulatory network of Arabidopsis thaliana. Applications of this technique are numerous and concern the building of models from data as well as the design of biological networks possessing specified behaviors.
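
The idea of characterizing every model consistent with the available knowledge can be illustrated with a minimal Python sketch. It is not the authors' method: plain enumeration stands in for the compilation to conjunctive normal form and the satisfiability solver, and the weight alphabet {-1, 0, 1}, the zero threshold, the synchronous update rule, and the observed 3-state cycle are all assumptions made for illustration. The names `step` and `consistent_models` are hypothetical.

```python
from itertools import product

def step(weights, state):
    """Synchronous threshold update: node i is on iff its weighted input is > 0."""
    n = len(state)
    return tuple(
        1 if sum(weights[i][j] * state[j] for j in range(n)) > 0 else 0
        for i in range(n)
    )

def consistent_models(n, transitions):
    """Enumerate every weight matrix over {-1, 0, 1} that reproduces all transitions."""
    models = []
    for flat in product((-1, 0, 1), repeat=n * n):
        weights = [flat[i * n:(i + 1) * n] for i in range(n)]
        if all(step(weights, s) == t for s, t in transitions):
            models.append(weights)
    return models

# Hypothetical behavioral knowledge: a cycle (1,0,0) -> (0,1,0) -> (0,0,1) -> (1,0,0).
observed = [
    ((1, 0, 0), (0, 1, 0)),
    ((0, 1, 0), (0, 0, 1)),
    ((0, 0, 1), (1, 0, 0)),
]
models = consistent_models(3, observed)
print(len(models))  # number of 3-node networks compatible with the observed cycle
```

Enumeration scales as 3^(n*n) and is only workable for very small networks, which is one reason a satisfiability solver is attractive for the same question on larger systems.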

5.
Research data management (RDM) requires standards, policies, and guidelines. Findable, accessible, interoperable, and reusable (FAIR) data management is critical for sustainable research. Therefore, collaborative approaches for managing FAIR-structured data are becoming increasingly important for long-term, sustainable RDM. However, they are applied rather hesitantly in bioengineering. One of the reasons may be found in the interdisciplinary character of the research field. In addition, bioengineering, as the application of principles of biology and tools of process engineering, often has to meet different criteria. In consequence, RDM is complicated by the fact that researchers from different scientific institutions must meet the criteria of their home institution, which can lead to additional conflicts. Therefore, centrally provided general repositories implementing a collaborative approach that enables data storage from the outset. In a biotechnology research network with over 20 tandem projects, it was demonstrated how FAIR-RDM can be implemented through a collaborative approach and the use of a data structure. In addition, the importance of a structure within a repository was demonstrated to keep biotechnology research data available throughout the entire data lifecycle. Furthermore, the biotechnology research network highlighted the importance of a structure within a repository to keep research data available throughout the entire data lifecycle.

6.
Bayesian networks can be used to identify possible causal relationships between variables based on their conditional dependencies and independencies, which can be particularly useful in complex biological scenarios with many measured variables. Here we propose two improvements to an existing method for Bayesian network analysis, designed to increase the power to detect potential causal relationships between variables (including potentially a mixture of both discrete and continuous variables). Our first improvement relates to the treatment of missing data. When there is missing data, the standard approach is to remove every individual with any missing data before performing analysis. This can be wasteful and undesirable when there are many individuals with missing data, perhaps with only one or a few variables missing. This motivates the use of imputation. We present a new imputation method that uses a version of nearest neighbour imputation, whereby missing data from one individual is replaced with data from another individual, their nearest neighbour. For each individual with missing data, the subsets of variables to be used to select the nearest neighbour are chosen by sampling without replacement the complete data and estimating a best fit Bayesian network. We show that this approach leads to marked improvements in the recall and precision of directed edges in the final network identified, and we illustrate the approach through application to data from a recent study investigating the causal relationship between methylation and gene expression in early inflammatory arthritis patients. We also describe a second improvement in the form of a pseudo-Bayesian approach for upweighting certain network edges, which can be useful when there is prior evidence concerning their directions.
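
The basic nearest neighbour imputation step described above can be sketched in Python as follows. This is a simplified illustration, not the authors' method: the matching columns are passed in by hand (in the abstract they are chosen by fitting a Bayesian network to resampled complete data), the distance is assumed to be Euclidean, and the function name `nearest_neighbour_impute` and the toy data are hypothetical.

```python
import math

def nearest_neighbour_impute(rows, match_cols):
    """Fill each None with the value from the closest fully observed row.

    Distance is Euclidean over match_cols. Rows that are missing one of
    the matching columns are left unchanged in this simplified version.
    """
    complete = [r for r in rows if None not in r]
    imputed = []
    for row in rows:
        if None not in row or any(row[c] is None for c in match_cols):
            imputed.append(list(row))
            continue
        donor = min(
            complete,
            key=lambda d: math.dist(
                [row[c] for c in match_cols], [d[c] for c in match_cols]
            ),
        )
        imputed.append([donor[i] if v is None else v for i, v in enumerate(row)])
    return imputed

data = [
    [1.0, 2.0, 10.0],
    [5.0, 6.0, 50.0],
    [1.1, 2.1, None],  # closest to the first row on columns 0 and 1
]
filled = nearest_neighbour_impute(data, match_cols=[0, 1])
print(filled[2])  # [1.1, 2.1, 10.0]
```

Individuals with missing values are kept in the data set instead of being discarded, which is the motivation for imputation given in the abstract.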

7.
8.
What are the commonalities between genes whose expression level is partially controlled by eQTL, especially with regard to biological functions? Moreover, how are these genes related to a phenotype of interest? These issues are particularly difficult to address when the genome annotation is incomplete, as is the case for mammalian species. Moreover, the direct link between gene expression and a phenotype of interest may be weak, and thus difficult to handle. In this framework, the use of a co-expression network has proven useful: it is a robust approach for modeling a complex system of genetic regulations and for inferring knowledge about as yet unknown genes. In this article, a case study was conducted with a mammalian species. It showed that the use of a co-expression network based on partial correlation, combined with a relevant clustering of nodes, leads to an enrichment of biological functions of around 83%. Moreover, the use of a spatial statistics approach allowed us to superimpose additional information related to a phenotype; this led to highlighting specific genes or gene clusters that are related to the network structure and the phenotype. Three main results are worth noting: first, key genes were highlighted as a potential focus for forthcoming biological experiments; second, a set of biological functions, which support a list of genes under partial eQTL control, was set up by an overview of the global structure of the gene expression network; third, pH was found to be correlated with gene clusters, and then with related biological functions, as a result of a spatial analysis of the network topology.
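
Why partial correlation is preferred over plain correlation for building such a network can be shown with a small Python sketch. It uses the textbook first-order formula for three variables, which is a simplification: the abstract does not say how its partial correlations were estimated, and the expression profiles below are invented so that two genes are correlated only through a shared regulator. The names `pearson` and `partial_corr` are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def partial_corr(x, y, z):
    """First-order partial correlation of x and y after removing the effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Invented profiles: gene_a and gene_b both follow the regulator plus independent noise.
regulator = [0, 1, 2, 3, 0, 1, 2, 3]
gene_a = [1, 2, 3, 4, -1, 0, 1, 2]
gene_b = [1, 0, 1, 4, 1, 0, 1, 4]

print(pearson(gene_a, gene_b))                   # clearly positive
print(partial_corr(gene_a, gene_b, regulator))   # close to zero
```

An edge drawn from the plain correlation would link the two genes directly, whereas the partial correlation attributes their association to the shared regulator.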

9.
Robust signal processing for embedded systems requires the effective identification and representation of features within raw sensory data. This task is inherently difficult due to unavoidable long-term changes in the sensory systems and/or the sensed environment. In this paper we explore four variations of competitive learning and examine their suitability as an unsupervised technique for the automated identification of data clusters within a given input space. The relative performance of the four techniques is evaluated through their ability to effectively represent the structure underlying artificial and real-world data distributions. As a result of this study it was found that frequency sensitive competitive learning provides both reliable and efficient solutions to complex data distributions. In addition, frequency sensitive and soft competitive learning are shown to exhibit properties which may permit the evolution of an appropriate network structure through the use of growing or pruning procedures.
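
A minimal Python sketch of frequency sensitive competitive learning follows. It assumes the common formulation in which each unit's distance to the input is multiplied by its win count. The abstract does not give the update rule used in the paper, so the learning rate, the 1-D data, the initial prototypes, and the function name `fscl` are all illustrative assumptions.

```python
def fscl(points, prototypes, epochs=20, lr=0.1):
    """Frequency sensitive competitive learning on 1-D data.

    Each unit's distance is scaled by its win count, so units that win
    often are handicapped and rarely winning units get a chance to move.
    """
    w = list(prototypes)
    wins = [1] * len(w)
    for _ in range(epochs):
        for x in points:
            # Winner: smallest count-weighted distance to the sample.
            k = min(range(len(w)), key=lambda i: wins[i] * abs(x - w[i]))
            w[k] += lr * (x - w[k])  # move only the winner toward the sample
            wins[k] += 1
    return w, wins

data = [0.0, 0.2, 0.1, 5.0, 5.2, 5.1]
weights, wins = fscl(data, prototypes=[10.0, 11.0])
print(weights, wins)
```

With plain competitive learning the prototype at 10.0 would be nearest to every sample and the one at 11.0 would never move. Here the second unit already wins the second sample, because the first unit's distance is doubled by its win count.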

10.
11.
Perturbation experiments, for example using RNA interference (RNAi), offer an attractive way to elucidate gene function in a high-throughput fashion. The placement of hit genes in their functional context and the inference of underlying networks from such data, however, are challenging tasks. One of the problems in network inference is the exponential number of possible network topologies for a given number of genes. Here, we introduce a novel mathematical approach to address this question. We formulate network inference as a linear optimization problem, which can be solved efficiently even for large-scale systems. We use simulated data to evaluate our approach, and show improved performance over state-of-the-art methods, in particular on larger networks. We achieve increased sensitivity and specificity, as well as a significant reduction in computing time. Furthermore, we show superior performance on noisy data. We then apply our approach to study the intracellular signaling of human primary naive CD4+ T-cells, as well as ErbB signaling in trastuzumab-resistant breast cancer cells. In both cases, our approach recovers known interactions and points to additional relevant processes. In ErbB signaling, our results predict an important role of negative and positive feedback in controlling cell cycle progression.

12.
Indirect interactions play an essential role in governing population, community and coevolutionary dynamics across a diverse range of ecological communities. Such communities are widely represented as bipartite networks: graphs depicting interactions between two groups of species, such as plants and pollinators or hosts and parasites. For over thirty years, studies have used indices, such as connectance and species degree, to characterise the structure of these networks and the roles of their constituent species. However, compressing a complex network into a single metric necessarily discards large amounts of information about indirect interactions. Given the large literature demonstrating the importance and ubiquity of indirect effects, many studies of network structure are likely missing a substantial piece of the ecological puzzle. Here we use the emerging concept of bipartite motifs to outline a new framework for bipartite networks that incorporates indirect interactions. While this framework is a significant departure from the current way of thinking about bipartite ecological networks, we show that this shift is supported by analyses of simulated and empirical data. We use simulations to show how consideration of indirect interactions can highlight differences missed by the current index paradigm that may be ecologically important. We extend this finding to empirical plant–pollinator communities, showing how two bee species, with similar direct interactions, differ in how specialised their competitors are. These examples underscore the need to not rely solely on network‐ and species‐level indices for characterising the structure of bipartite ecological networks.
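
The two indices named above, connectance and species degree, can be computed as in the following Python sketch, using their usual definitions. The interaction matrix is a made-up plant-by-pollinator web and the function names are hypothetical. The sketch covers only the index paradigm that the abstract argues is insufficient, not the motif framework itself.

```python
def connectance(matrix):
    """Fraction of possible plant-pollinator links that are realised."""
    links = sum(1 for row in matrix for cell in row if cell)
    return links / (len(matrix) * len(matrix[0]))

def degrees(matrix):
    """Number of partners of each row species and of each column species."""
    row_deg = [sum(1 for cell in row if cell) for row in matrix]
    col_deg = [sum(1 for row in matrix if row[j]) for j in range(len(matrix[0]))]
    return row_deg, col_deg

# Hypothetical web: rows are plants, columns are pollinators, 1 marks an interaction.
web = [
    [1, 1, 0],
    [0, 1, 0],
    [0, 1, 1],
]
print(connectance(web))  # 5 of 9 possible links
print(degrees(web))      # ([2, 1, 2], [1, 3, 1])
```

The first and third pollinators have the same degree, yet they share their single plant with different competitors. Information of that kind is what a single index leaves out.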

13.
Since metabolome data are derived from the underlying metabolic network, reverse engineering of such data to recover the network topology is of wide interest. The Lyapunov equation constrains the link between data and network by coupling the covariance of the data with the strength of interactions (the Jacobian matrix). This equation, when expressed as a linear set of equations at steady state, constitutes a basis for inferring the network structure given the covariance matrix of the data. The sparse structure of metabolic networks points to reactions which are active based on minimal enzyme production, hinting at sparsity as a cellular objective. Therefore, for a given covariance matrix, we solved the Lyapunov equation for the Jacobian matrix, using minimization of the Euclidean norm of the residuals and maximization of sparsity (the number of zeros in the Jacobian matrix) simultaneously as objective functions, to infer directed small-scale networks from three kingdoms of life (bacteria, fungi, mammalian). The inference performance of the approach was found to be promising, with a false positive rate of zero and a true positive rate of almost one. The effect of missing data on the results was additionally analyzed, revealing superiority over similarity-based approaches, which infer undirected networks. Our findings suggest that the covariance of metabolome data implies an underlying network with the sparsest pattern. The theoretical analysis forms a framework for further investigation of sparsity-based inference of metabolic networks from real metabolome data.
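
The abstract does not write the equation out. As a reading aid, the steady-state Lyapunov equation is commonly stated in the form below, where J is the Jacobian matrix, C is the covariance matrix of the metabolome data, and D is a fluctuation (noise) matrix. The symbol D and the factor 2 follow a common convention and are assumptions here, not taken from the abstract.

```latex
% Steady-state Lyapunov equation linking data covariance C to the Jacobian J
J\,C + C\,J^{\top} = -2D

% The two objectives named in the abstract, written for an unknown Jacobian J:
% minimize the residual norm and maximize the number of zero entries
\min_{J}\ \bigl\lVert J\,C + C\,J^{\top} + 2D \bigr\rVert_{2},
\qquad
\max_{J}\ \#\{(i,j) : J_{ij} = 0\}
```

Because the left-hand side is linear in the entries of J, fixing C and D gives a linear system in the unknown interaction strengths, which is the sense in which the abstract speaks of "a linear set of equations".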

14.
Systems biology aims to study the properties of biological systems in terms of the properties of their molecular constituents. This is frequently done through a process of mathematical modelling. The first step in this modelling process is to unravel the interaction structure of biological systems from experimental data. Previously, an algorithm for gene network inference from gene expression perturbation data was proposed. Here, the algorithm is extended by using regression with subset selection. The performance of the algorithm is extensively evaluated on a set of data produced with gene network models at different levels of simulated experimental noise. Regression with subset selection outperforms the previously stated matrix inverse approach in the presence of experimental noise. Furthermore, this regression approach enables us to deal with under-determination, that is, when not all genes are perturbed. The results on incomplete data sets show that the new method performs well with a higher number of perturbations, even when noise levels are high. With a lower number of perturbations, the method is still able to recover the majority of the connections, but less confidence can be placed in the recovered edges.

15.
Network representations of biological systems are widespread and reconstructing unknown networks from data is a focal problem for computational biologists. For example, the series of biochemical reactions in a metabolic pathway can be represented as a network, with nodes corresponding to metabolites and edges linking reactants to products. In a different context, regulatory relationships among genes are commonly represented as directed networks with edges pointing from influential genes to their targets. Reconstructing such networks from data is a challenging problem receiving much attention in the literature. There is a particular need for approaches tailored to time-series data and not reliant on direct intervention experiments, as the former are often more readily available. In this paper, we introduce an approach to reconstructing directed networks based on dynamic systems models. Our approach generalizes commonly used ODE models based on linear or nonlinear dynamics by extending the functional class for the functions involved from parametric to nonparametric models. Concomitantly we limit the complexity by imposing an additive structure on the estimated slope functions. Thus the submodel associated with each node is a sum of univariate functions. These univariate component functions form the basis for a novel coupling metric that we define in order to quantify the strength of proposed relationships and hence rank potential edges. We show the utility of the method by reconstructing networks using simulated data from computational models for the glycolytic pathway of Lactococcus lactis and a gene network regulating the pluripotency of mouse embryonic stem cells. For purposes of comparison, we also assess reconstruction performance using gene networks from the DREAM challenges. We compare our method to those that similarly rely on dynamic systems models and use the results to attempt to disentangle the distinct roles of linearity, sparsity, and derivative estimation.
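
The additive structure described above can be written out as follows. This is a sketch of the general form implied by "a sum of univariate functions"; the symbols (an intercept, component functions indexed by target and source node, p nodes in total) are notation assumed for illustration, and the paper's coupling metric is not reproduced because the abstract does not define it.

```latex
% Additive nonparametric ODE model for node j in a network of p nodes:
% the slope of x_j is a sum of univariate functions of each node's level
\frac{dx_j(t)}{dt} = \mu_j + \sum_{k=1}^{p} f_{jk}\bigl(x_k(t)\bigr),
\qquad j = 1, \dots, p

% A linear ODE model is the special case f_{jk}(x) = a_{jk}\,x
```

In this reading, an edge from node k to node j is supported when the estimated component function for that pair is far from constant, which is the kind of evidence the ranking of potential edges relies on.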

16.
Artificial neural networks and their use in quantitative pathology (total citations: 2; self-citations: 0; citations by others: 2)
A brief general introduction to artificial neural networks is presented, examining in detail the structure and operation of a prototype net developed for the solution of a simple pattern recognition problem in quantitative pathology. The process by which a neural network learns through example and gradually embodies its knowledge as a distributed representation is discussed, using this example. The application of neurocomputer technology to problems in quantitative pathology is explored, using real-world and illustrative examples. Included are examples of the use of artificial neural networks for pattern recognition, database analysis and machine vision. In the context of these examples, characteristics of neural nets, such as their ability to tolerate ambiguous, noisy and spurious data and spontaneously generalize from known examples to handle unfamiliar cases, are examined. Finally, the strengths and deficiencies of a connectionist approach are compared to those of traditional symbolic expert system methodology. It is concluded that artificial neural networks, used in conjunction with other nonalgorithmic artificial intelligence techniques and traditional algorithmic processing, may provide useful software engineering tools for the development of systems in quantitative pathology.

17.
Understanding the complex regulatory networks underlying development and evolution of multi-cellular organisms is a major problem in biology. Computational models can be used as tools to extract the regulatory structure and dynamics of such networks from gene expression data. This approach is called reverse engineering. It has been successfully applied to many gene networks in various biological systems. However, to reconstitute the structure and non-linear dynamics of a developmental gene network in its spatial context remains a considerable challenge. Here, we address this challenge using a case study: the gap gene network involved in segment determination during early development of Drosophila melanogaster. A major problem for reverse-engineering pattern-forming networks is the significant amount of time and effort required to acquire and quantify spatial gene expression data. We have developed a simplified data processing pipeline that considerably increases the throughput of the method, but results in data of reduced accuracy compared to those previously used for gap gene network inference. We demonstrate that we can infer the correct network structure using our reduced data set, and investigate minimal data requirements for successful reverse engineering. Our results show that timing and position of expression domain boundaries are the crucial features for determining regulatory network structure from data, while it is less important to precisely measure expression levels. Based on this, we define minimal data requirements for gap gene network inference. Our results demonstrate the feasibility of reverse engineering with much reduced experimental effort. This enables more widespread use of the method in different developmental contexts and organisms. Such systematic application of data-driven models to real-world networks has enormous potential. Only the quantitative investigation of a large number of developmental gene regulatory networks will allow us to discover whether there are rules or regularities governing development and evolution of complex multi-cellular organisms.

18.
Proteins are the active players in performing essential molecular activities throughout biology, and their dynamics have been broadly demonstrated to relate to their mechanisms. The intrinsic fluctuations have often been used to represent their dynamics and then compared to the experimental B-factors. However, proteins do not move in a vacuum, and their motions are modulated by solvent that can impose forces on the structure. In this paper, we introduce a new structural concept, called the structural compliance, for the evaluation of the global and local deformability of the protein structure in response to intramolecular and solvent forces. Based on the application of pairwise pulling forces to a protein elastic network, this structural quantity has been computed and is sometimes even found to yield an improved correlation with the experimental B-factors, meaning that it may serve as a better metric for protein flexibility. The inverse of structural compliance, namely the structural stiffness, has also been defined, and it shows a clear anticorrelation with the experimental data. Although the present applications are made to proteins, this approach can also be applied to other biomolecular structures such as RNA. The present study considers only elastic network models, but the approach could be applied further to conventional atomic molecular dynamics. Compliance is found to have a slightly better agreement with the experimental B-factors, perhaps reflecting its bias toward the effects of local perturbations, in contrast to mean square fluctuations. The code for calculating protein compliance and stiffness is freely accessible at https://jerniganlab.github.io/Software/PACKMAN/Tutorials/compliance .

19.
Actin filaments are a major component of the cytoskeleton and play a crucial role in cell mechanotransduction. F-actin networks can be reconstituted in vitro and their mechanical behaviour has been studied experimentally. Constitutive models that assume an idealised network structure, in combination with a non-affine network deformation, have been successful in capturing the elastic response of the network. In this study, we assume an affine network deformation and propose an alternative 3D finite strain constitutive model. The model makes use of a micro-sphere to calculate the strain energy density of the network, which is represented as a continuous distribution of filament orientations in space. By incorporating a simplified sliding mechanism at the filament-to-filament junctions, premature filament locking, inherent to affine network deformation, could be avoided. The model could successfully fit experimental shear data for a specific cross-linked F-actin network, demonstrating the potential of the novel model.

20.
Collagens are a family of at least 30 protein types organized as networks. They constitute the main support material of cells in the form of the extracellular matrix, as well as of membranes in vessels, organs, and tissue compartments. Collagen network abnormalities are at the origin of many diseases, including myopathies and fibroses. The characterization of collagens remains an analytical challenge due to the insolubility of these molecules and the difficulty encountered in isolating given types without altering their structure or in maintaining network organization, which is critical to diagnosing related pathologies. We have proposed using a vibrational spectroscopy based imaging technique, namely Fourier-transform infrared (FTIR) imaging, for a spatially resolved analysis of the secondary structure of different collagen types in complex samples, and more specifically for characterizing gliomas. With newly developed spectral data treatments and chemometrics using secondary structure parameters of collagen proteins, FTIR imaging is now able to distinguish between several types. On this basis, gliomas have been investigated as specific collagen-rich tissues developing in a non-collagenous environment, providing high specificity to this use of FTIR imaging. Here, we review the recent advances in this imaging approach for understanding glioma development, with FTIR imaging now being proposed as a molecular histopathology tool for clinicians.
