首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
Path matching and graph matching in biological networks.   总被引:2,自引:0,他引:2  
We develop algorithms for the following path matching and graph matching problems: (i) given a query path p and a graph G, find a path p' that is most similar to p in G; (ii) given a query graph G (0) and a graph G, find a graph G (0)' that is most similar to G (0) in G. In these problems, p and G (0) represent a given substructure of interest to a biologist, and G represents a large network in which the biologist desires to find a related substructure. These algorithms allow the study of common substructures in biological networks in order to understand how these networks evolve both within and between organisms. We reduce the path matching problem to finding a longest weighted path in a directed acyclic graph and show that the problem of finding top k suboptimal paths can be solved in polynomial time. This is in contrast with most previous approaches that used exponential time algorithms to find simple paths which are practical only when the paths are short. We reduce the graph matching problem to finding highest scoring subgraphs in a graph and give an exact algorithm to solve the problem when the query graph G (0) is of moderate size. This eliminates the need for less accurate heuristic or randomized algorithms.We show that our algorithms are able to extract biologically meaningful pathways from protein interaction networks in the DIP database and metabolic networks in the KEGG database. Software programs implementing these techniques (PathMatch and GraphMatch) are available at http://faculty.cs.tamu.edu/shsze/pathmatch and http://faculty.cs.tamu.edu/shsze/graphmatch.  相似文献   

2.
Graph models of habitat mosaics   总被引:7,自引:0,他引:7  
Graph theory is a body of mathematics dealing with problems of connectivity, flow, and routing in networks ranging from social groups to computer networks. Recently, network applications have erupted in many fields, and graph models are now being applied in landscape ecology and conservation biology, particularly for applications couched in metapopulation theory. In these applications, graph nodes represent habitat patches or local populations and links indicate functional connections among populations (i.e. via dispersal). Graphs are models of more complicated real systems, and so it is appropriate to review these applications from the perspective of modelling in general. Here we review recent applications of network theory to habitat patches in landscape mosaics. We consider (1) the conceptual model underlying these applications; (2) formalization and implementation of the graph model; (3) model parameterization; (4) model testing, insights, and predictions available through graph analyses; and (5) potential implications for conservation biology and related applications. In general, and for a variety of ecological systems, we find the graph model a remarkably robust framework for applications concerned with habitat connectivity. We close with suggestions for further work on the parameterization and validation of graph models, and point to some promising analytic insights.  相似文献   

3.
Graph representations have been widely used to analyze and design various economic, social, military, political, and biological networks. In systems biology, networks of cells and organs are useful for understanding disease and medical treatments and, in structural biology, structures of molecules can be described, including RNA structures. In our RNA-As-Graphs (RAG) framework, we represent RNA structures as tree graphs by translating unpaired regions into vertices and helices into edges. Here we explore the modularity of RNA structures by applying graph partitioning known in graph theory to divide an RNA graph into subgraphs. To our knowledge, this is the first application of graph partitioning to biology, and the results suggest a systematic approach for modular design in general. The graph partitioning algorithms utilize mathematical properties of the Laplacian eigenvector (µ2) corresponding to the second eigenvalues (λ2) associated with the topology matrix defining the graph: λ2 describes the overall topology, and the sum of µ2′s components is zero. The three types of algorithms, termed median, sign, and gap cuts, divide a graph by determining nodes of cut by median, zero, and largest gap of µ2′s components, respectively. We apply these algorithms to 45 graphs corresponding to all solved RNA structures up through 11 vertices (∼220 nucleotides). While we observe that the median cut divides a graph into two similar-sized subgraphs, the sign and gap cuts partition a graph into two topologically-distinct subgraphs. We find that the gap cut produces the best biologically-relevant partitioning for RNA because it divides RNAs at less stable connections while maintaining junctions intact. The iterative gap cuts suggest basic modules and assembly protocols to design large RNA structures. Our graph substructuring thus suggests a systematic approach to explore the modularity of biological networks. In our applications to RNA structures, subgraphs also suggest design strategies for novel RNA motifs.  相似文献   

4.
An approach is presented for computing meaningful pathways in the network of small molecule metabolism comprising the chemical reactions characterized in all organisms. The metabolic network is described as a weighted graph in which all the compounds are included, but each compound is assigned a weight equal to the number of reactions in which it participates. Path finding is performed in this graph by searching for one or more paths with lowest weight. Performance is evaluated systematically by computing paths between the first and last reactions in annotated metabolic pathways, and comparing the intermediate reactions in the computed pathways to those in the annotated ones. For the sake of comparison, paths are computed also in the un-weighted raw (all compounds and reactions) and filtered (highly connected pool metabolites removed) metabolic graphs, respectively. The correspondence between the computed and annotated pathways is very poor (<30%) in the raw graph; increasing to approximately 65% in the filtered graph; reaching approximately 85% in the weighted graph. Considering the best-matching path among the five lightest paths increases the correspondence to 92%, on average. We then show that the average distance between pairs of metabolites is significantly larger in the weighted graph than in the raw unfiltered graph, suggesting that the small-world properties previously reported for metabolic networks probably result from irrelevant shortcuts through pool metabolites. In addition, we provide evidence that the length of the shortest path in the weighted graph represents a valid measure of the "metabolic distance" between enzymes. We suggest that the success of our simplistic approach is rooted in the high degree of specificity of the reactions in metabolic pathways, presumably reflecting thermodynamic constraints operating in these pathways. We expect our approach to find useful applications in inferring metabolic pathways in newly sequenced genomes.  相似文献   

5.
Mapping the detailed connectivity patterns (connectomes) of neural circuits is a central goal of neuroscience. The best quantitative approach to analyzing connectome data is still unclear but graph theory has been used with success. We present a graph theoretical model of the posterior lateral line sensorimotor pathway in zebrafish. The model includes 2,616 neurons and 167,114 synaptic connections. Model neurons represent known cell types in zebrafish larvae, and connections were set stochastically following rules based on biological literature. Thus, our model is a uniquely detailed computational representation of a vertebrate connectome. The connectome has low overall connection density, with 2.45% of all possible connections, a value within the physiological range. We used graph theoretical tools to compare the zebrafish connectome graph to small-world, random and structured random graphs of the same size. For each type of graph, 100 randomly generated instantiations were considered. Degree distribution (the number of connections per neuron) varied more in the zebrafish graph than in same size graphs with less biological detail. There was high local clustering and a short average path length between nodes, implying a small-world structure similar to other neural connectomes and complex networks. The graph was found not to be scale-free, in agreement with some other neural connectomes. An experimental lesion was performed that targeted three model brain neurons, including the Mauthner neuron, known to control fast escape turns. The lesion decreased the number of short paths between sensory and motor neurons analogous to the behavioral effects of the same lesion in zebrafish. This model is expandable and can be used to organize and interpret a growing database of information on the zebrafish connectome.  相似文献   

6.
There are two key characteristic of animal and human societies: (1) degree heterogeneity, meaning that not all individual have the same number of associates; and (2) the interaction topology is not static, i.e. either individuals interact with different set of individuals at different times of their life, or at least they have different associations than their parents. Earlier works have shown that population structure is one of the mechanisms promoting cooperation. However, most studies had assumed that the interaction network can be described by a regular graph (homogeneous degree distribution). Recently there are an increasing number of studies employing degree heterogeneous graphs to model interaction topology. But mostly the interaction topology was assumed to be static. Here we investigate the fixation probability of the cooperator strategy in the prisoner's dilemma, when interaction network is a random regular graph, a random graph or a scale-free graph and the interaction network is allowed to change.  相似文献   

7.
Annotation of cells in single-cell clustering requires a homogeneous grouping of cell populations. There are various issues in single cell sequencing that effect homogeneous grouping (clustering) of cells, such as small amount of starting RNA, limited per-cell sequenced reads, cell-to-cell variability due to cell-cycle, cellular morphology, and variable reagent concentrations. Moreover, single cell data is susceptible to technical noise, which affects the quality of genes (or features) selected/extracted prior to clustering.Here we introduce sc-CGconv (copula based graph convolution network for single clustering), a stepwise robust unsupervised feature extraction and clustering approach that formulates and aggregates cell–cell relationships using copula correlation (Ccor), followed by a graph convolution network based clustering approach. sc-CGconv formulates a cell-cell graph using Ccor that is learned by a graph-based artificial intelligence model, graph convolution network. The learned representation (low dimensional embedding) is utilized for cell clustering. sc-CGconv features the following advantages. a. sc-CGconv works with substantially smaller sample sizes to identify homogeneous clusters. b. sc-CGconv can model the expression co-variability of a large number of genes, thereby outperforming state-of-the-art gene selection/extraction methods for clustering. c. sc-CGconv preserves the cell-to-cell variability within the selected gene set by constructing a cell-cell graph through copula correlation measure. d. sc-CGconv provides a topology-preserving embedding of cells in low dimensional space.  相似文献   

8.
景观结构和空间格局一直是景观生态学的核心问题,图论的应用为景观格局的分析提供了一种研究框架,基于图论的景观图逐渐被应用于生物多样性保护的连通性建模和景观规划的决策支持研究,景观图的表达、分析和应用已成为保护生物学和景观生态学研究的热点。本文首先介绍了景观图的图论基础,在Scopus数据库的基础上,检索了1993—2019年在标题、摘要和关键词中出现 “landscape graph”、“connectivity”和“network”词汇的257篇已发表的期刊论文。从年发文量、来源期刊、研究区域、研究机构、景观类型等方面分析了该领域的研究进展和发展趋势。分析发现,2017年之前,发表的期刊论文数量整体呈上升趋势,2017年之后年发文量逐年下降;主要研究力量集中在美国、西班牙、法国、加拿大和中国,发文量占到86.8%。大部分研究成果发表在“Landscape Ecology”、“Landscape and Urban Planning”和“Biological Conservation”期刊上。在研究内容上,景观图表达主要包括点的定义、边的度量和景观的模拟3方面,景观图分析研究包括分析指数、景观图划分两方面。我们重点关注了景观图在生物多样性保护、景观(生态网络)规划和管理、景观影响评价等科学与实践中的应用。基于图论的景观图通过帮助理解景观连通性变化、动物行为和栖息地保护,影响着保护科学和规划实践者。图论对保护科学和规划的影响来自于丰富的理论基础和成熟的研究方法,基于图论的景观图为景观结构和格局的生态学理解提供了跳板,并将继续成为全球研究人员和实践者的重要工具。  相似文献   

9.
高梅香  朱家祺  刘爽  程鑫  刘冬  李彦胜 《生态学报》2023,43(16):6862-6877
土壤动物学面临以全新知识体系为科学研究框架的变革时期,其核心内容是以数据驱动为主要特征的人工智能技术方法。目前广泛应用的基于数据库的数据处理分析方法,面临着数据多源异构、快速增长和处理能力不足之间的矛盾。基于快速发展的大数据科学和人工智能技术的数据挖掘方法在解决前述矛盾中有突出优势,但需要依赖一个强大的领域知识库,然而土壤动物领域知识图谱的研究十分匮乏。土壤动物知识图谱是一个具有有向图结构的知识库,其中图的节点代表与土壤动物相关的实体或概念,图的边代表实体或概念之间的各种语义关系。提出了土壤动物知识图谱的定义、内涵、理论模型和构建方法,以浙江天目山土壤螨类多样性为例,分析了构建山地土壤动物知识图谱的技术方法;以土壤动物多样性研究关注的物种分布、物种共存、环境条件对物种的影响作用为例,探讨了基于山地土壤动物知识图谱可以解决的相关科学问题。研究表明,土壤动物知识图谱在解决生物多样性重要科学问题方面具有独特的潜力和优势,有力推动了土壤动物学、信息科学和数据科学交叉的土壤动物信息学的发展。  相似文献   

10.
Three dimensional DNA structures in computing   总被引:13,自引:0,他引:13  
Jonoska N  Karl SA  Saito M 《Bio Systems》1999,52(1-3):143-153
We show that 3-dimensional graph structures can be used for solving computational problems with DNA molecules. Vertex building blocks consisting of k-armed (k = 3 or 4) branched junction molecules are used to form graphs. We present procedures for the 3-SAT and 3-vertex-colorability problems. Construction of one graph structure (in many copies) is sufficient to determine the solution to the problem. In our proposed procedure for 3-SAT, the number of steps required is equal to the number of variables in the formula. For the 3-vertex-colorability problem, the procedure requires a constant number of steps regardless of the size of the graph.  相似文献   

11.
Dense subgraphs of Protein-Protein Interaction (PPI) graphs are assumed to be potential functional modules and play an important role in inferring the functional behavior of proteins. Increasing amount of available PPI data implies a fast, accurate approach of biological complex identification. Therefore, there are different models and algorithms in identifying functional modules. This paper describes a new graph theoretic clustering algorithm that detects densely connected regions in a large PPI graph. The method is based on finding bounded diameter subgraphs around a seed node. The algorithm has the advantage of being very simple and efficient when compared with other graph clustering methods. This algorithm is tested on the yeast PPI graph and the results are compared with MCL, Core-Attachment, and MCODE algorithms.  相似文献   

12.
《IRBM》2022,43(5):333-339
1) ObjectivesPreterm birth caused by preterm labor is one of the major health problems in the world. In this article, we present a new framework for dealing with this problem through the processing of electrohysterographic signals (EHG) that are recorded during labor and pregnancy. The objective in this research is to improve the classification between labor and pregnancy contractions by using a new approach that focuses on the connectivity analysis based on graph parameters, representative of uterine synchronization, and comparing neural network and machine learning methods in order to classify between labor and pregnancy.2) Material and methodsafter denoising of the 16 EHG signals recorded from pregnant women abdomen, we applied different connectivity methods to obtain connectivity matrices; then by using the graph theory, we extracted some graph parameters from the connectivity matrices; finally, we tested different neural network and machine learning methods on the features obtained from both graph and connectivity methods in order to classify between labor and pregnancy.3) ResultsThe best results were obtained by using the logistic regression method. We also evidence the power of graph parameters extracted from the connectivity matrices to improve the classification results.4) ConclusionThe use of graph analysis associated with machine learning methods can be a powerful tool to improve labor and pregnancy classification based on the analysis of EHG signals.  相似文献   

13.
We present an RNA-As-Graphs (RAG) based inverse folding algorithm, RAG-IF, to design novel RNA sequences that fold onto target tree graph topologies. The algorithm can be used to enhance our recently reported computational design pipeline (Jain et al., NAR 2018). The RAG approach represents RNA secondary structures as tree and dual graphs, where RNA loops and helices are coarse-grained as vertices and edges, opening the usage of graph theory methods to study, predict, and design RNA structures. Our recently developed computational pipeline for design utilizes graph partitioning (RAG-3D) and atomic fragment assembly (F-RAG) to design sequences to fold onto RNA-like tree graph topologies; the atomic fragments are taken from existing RNA structures that correspond to tree subgraphs. Because F-RAG may not produce the target folds for all designs, automated mutations by RAG-IF algorithm enhance the candidate pool markedly. The crucial residues for mutation are identified by differences between the predicted and the target topology. A genetic algorithm then mutates the selected residues, and the successful sequences are optimized to retain only the minimal or essential mutations. Here we evaluate RAG-IF for 6 RNA-like topologies and generate a large pool of successful candidate sequences with a variety of minimal mutations. We find that RAG-IF adds robustness and efficiency to our RNA design pipeline, making inverse folding motivated by graph topology rather than secondary structure more productive.  相似文献   

14.
ABSTRACT: BACKGROUND: Identification of protein structural cores requires isolation of sets of proteins all sharing a same subset of structural motifs. In the context of ever growing number of available 3D protein structures, standard and automatic clustering algorithms require adaptations so as to allow for efficient identification of such sets of proteins. RESULTS: When considering a pair of 3D structures, they are stated as similar or not according to the local similarities of their matching substructures in a structural alignment. This binary relation can be represented in a graph of similarities where a node represents a 3D protein structure and an edge states that two 3D protein structures are similar. Therefore, the classification of proteins into structural families can be viewed as graph clustering task. Unfortunately, because such a graph encodes only pairwise similarity information, clustering algorithms may group in the same cluster a subset of 3D structures that do not share a common substructure. To overcome this drawback we first define a ternary similarity on a triple of 3D structures as a constraint to be satisfied by the graph of similarities. Such a ternary constraint takes into account similarities between pairwise alignments, so as to ensure that the three involved protein structures do have some common substructure. We propose hereunder a modification algorithm that eliminates edges from the original graph of similarities and outputs a reduced graph in which no ternary constraints are violated. Our proposition is then first to build a graph of similarities, then to reduce the graph according to the modification algorithm, and finally to apply to the reduced graph a standard graph clustering algorithm. We applied this method to ASTRAL-40 non-redundant protein domains, identifying significant pairwise similarities with Yakusa, a program devised for rapid 3D structure alignments. CONCLUSIONS: We show that filtering similarities prior to standard graph based clustering process by applying ternary similarity constraints i) improves the separation of proteins of different classes and consequently ii) improves the classification quality of standard graph based clustering algorithms according to the reference classification SCOP.  相似文献   

15.
Behaviour of chemical networks that are described by systems of ordinary differential equations can be analysed via the associated graph structures. This paper deals with observations based on the interaction graph which is defined by the signs of the Jacobian matrix entries. Some of the important graph structures linked to network dynamics are signed circuits and the nucleus (or Hamiltonian hooping). We use mass-action chemical reaction networks as an example to showcase interesting observations about the aforementioned interaction graph structures. We show that positive circuits and specific nucleus structures (associated to multistationarity) are always present in a great generic class of mass-action chemical and biological networks. The theory of negative circuits remains poorly understood, but there is some evidence that they are indicators of stable periodicity. Here we introduce the concept of non-isolated circuits which indicate the presence of a negative circuit.  相似文献   

16.
The directed Hamiltonian path (DHP) problem is one of the hard computational problems for which there is no practical algorithm on a conventional computer available. Many problems, including the traveling sales person problem and the longest path problem, can be translated into the DHP problem, which implies that an algorithm for DHP can also solve all the translated problems. To study the robustness of the laboratory protocol of the pioneering DNA computing for the DHP problem performed by Leonard Adleman (1994), we investigated how the graph size, multiplicity of the Hamiltonian paths, and the size of oligonucleotides that encode the vertices would affect the laboratory procedures. We applied Adleman's protocol with 18-mer oligonucleotide per node to a graph with 8 vertices and 14 edges containing two Hamiltonian paths (Adleman used 20-mer oligonucleotides for a graph with 7 nodes, 14 edges and one Hamiltonian path). We found that depending on the graph characteristics such as the number of short cycles, the oligonucleotide size, and the hybridization conditions that used to encode the graph, the protocol should be executed with different parameters from Adleman's.  相似文献   

17.
We study nonlinear electrical oscillator networks, the smallest example of which consists of a voltage-dependent capacitor, an inductor, and a resistor driven by a pure tone source. By allowing the network topology to be that of any connected graph, such circuits generalize spatially discrete nonlinear transmission lines/lattices that have proven useful in high-frequency analog devices. For such networks, we develop two algorithms to compute the steady-state response when a subset of nodes are driven at the same fixed frequency. The algorithms we devise are orders of magnitude more accurate and efficient than stepping towards the steady-state using a standard numerical integrator. We seek to enhance a given network''s nonlinear behavior by altering the eigenvalues of the graph Laplacian, i.e., the resonances of the linearized system. We develop a Newton-type method that solves for the network inductances such that the graph Laplacian achieves a desired set of eigenvalues; this method enables one to move the eigenvalues while keeping the network topology fixed. Running numerical experiments using three different random graph models, we show that shrinking the gap between the graph Laplacian''s first two eigenvalues dramatically improves a network''s ability to (i) transfer energy to higher harmonics, and (ii) generate large-amplitude signals. Our results shed light on the relationship between a network''s structure, encoded by the graph Laplacian, and its function, defined in this case by the presence of strongly nonlinear effects in the frequency response.  相似文献   

18.
The influence of cytosine-arabinoside (ara-C) on the nuclear size of monolayer cells from mouse brains were studied by quantitative analysis with the Quantimet (area measurement) and linear analysis with the Digiscan. Ara C (0,1; 1,0; 10,0 mug/ml) causes a significant nuclear enlargement, depending on the dose of 16%, 30% and 35% respectively and at the same time a decrease in cell density and an inhibition of mitotic activity. The size-distribution graph is altered by ara-C: the graph flattens and widens at the base. Moreover, in the case of 10,0 mug ara-C/ml a pronounced graph shoulder is visible in the small nuclear size region which can be interpreted as a slight accumulation of small nuclei. The nuclear enlargement is possibly due to an accumulation of shortchain polynucleotides.  相似文献   

19.
20.
A graph theoretical analysis of nuclear magnetic resonance (NMR) data of six different protein interactions has been presented. The representation of the protein interaction data as a graph or network reveals that all of the studied interactions are based on a common functional concept. They all involve a single densely packed hub of functionally correlated residues that mediate the ligand binding events. This is found independent of the kind of protein (folded or unfolded) or ligand (protein, polymer or small molecule). Furthermore, the power of the graph analysis is demonstrated at the examples of the Calmodulin (CaM)/Calcium and the Cold Shock Protein A (CspA)/RNA interaction. The presented approach enables the precise determination of multiple binding sites for the respective ligand molecules.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号