首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 0 毫秒
1.
The assessment of phylogenetic network reconstruction methods requires the ability to compare phylogenetic networks. This is the second in a series of papers devoted to the analysis and comparison of metrics for tree-child time consistent phylogenetic networks on the same set of taxa. In this paper, we generalize to phylogenetic networks two metrics that have already been introduced in the literature for phylogenetic trees: the nodal distance and the triplets distance. We prove that they are metrics on any class of tree-child time consistent phylogenetic networks on the same set of taxa, as well as some basic properties for them. To prove these results, we introduce a reduction/expansion procedure that can be used not only to establish properties of tree-child time consistent phylogenetic networks by induction, but also to generate all tree-child time consistent phylogenetic networks with a given number of leaves.  相似文献   

2.
We prove that Nakhleh's metric for reduced phylogenetic networks is also a metric on the classes of tree-child phylogenetic networks, semibinary tree-sibling time consistent phylogenetic networks, and multilabeled phylogenetic trees. We also prove that it separates distinguishable phylogenetic networks. In this way, it becomes the strongest dissimilarity measure for phylogenetic networks available so far. Furthermore, we propose a generalization of that metric that separates arbitrary phylogenetic networks.  相似文献   

3.
Phylogenetic networks are a generalization of evolutionary trees that are used by biologists to represent the evolution of organisms which have undergone reticulate evolution. Essentially, a phylogenetic network is a directed acyclic graph having a unique root in which the leaves are labelled by a given set of species. Recently, some approaches have been developed to construct phylogenetic networks from collections of networks on 2- and 3-leaved networks, which are known as binets and trinets, respectively. Here we study in more depth properties of collections of binets, one of the simplest possible types of networks into which a phylogenetic network can be decomposed. More specifically, we show that if a collection of level-1 binets is compatible with some binary network, then it is also compatible with a binary level-1 network. Our proofs are based on useful structural results concerning lowest stable ancestors in networks. In addition, we show that, although the binets do not determine the topology of the network, they do determine the number of reticulations in the network, which is one of its most important parameters. We also consider algorithmic questions concerning binets. We show that deciding whether an arbitrary set of binets is compatible with some network is at least as hard as the well-known graph isomorphism problem. However, if we restrict to level-1 binets, it is possible to decide in polynomial time whether there exists a binary network that displays all the binets. We also show that to find a network that displays a maximum number of the binets is NP-hard, but that there exists a simple polynomial-time 1/3-approximation algorithm for this problem. It is hoped that these results will eventually assist in the development of new methods for constructing phylogenetic networks from collections of smaller networks.  相似文献   

4.
Computational network analysis provides new methods to analyze the brain''s structural organization based on diffusion imaging tractography data. Networks are characterized by global and local metrics that have recently given promising insights into diagnosis and the further understanding of psychiatric and neurologic disorders. Most of these metrics are based on the idea that information in a network flows along the shortest paths. In contrast to this notion, communicability is a broader measure of connectivity which assumes that information could flow along all possible paths between two nodes. In our work, the features of network metrics related to communicability were explored for the first time in the healthy structural brain network. In addition, the sensitivity of such metrics was analysed using simulated lesions to specific nodes and network connections. Results showed advantages of communicability over conventional metrics in detecting densely connected nodes as well as subsets of nodes vulnerable to lesions. In addition, communicability centrality was shown to be widely affected by the lesions and the changes were negatively correlated with the distance from lesion site. In summary, our analysis suggests that communicability metrics that may provide an insight into the integrative properties of the structural brain network and that these metrics may be useful for the analysis of brain networks in the presence of lesions. Nevertheless, the interpretation of communicability is not straightforward; hence these metrics should be used as a supplement to the more standard connectivity network metrics.  相似文献   

5.
Phylogenetic networks represent the evolution of organisms that have undergone reticulate events, such as recombination, hybrid speciation or lateral gene transfer. An important way to interpret a phylogenetic network is in terms of the trees it displays, which represent all the possible histories of the characters carried by the organisms in the network. Interestingly, however, different networks may display exactly the same set of trees, an observation that poses a problem for network reconstruction: from the perspective of many inference methods such networks are indistinguishable. This is true for all methods that evaluate a phylogenetic network solely on the basis of how well the displayed trees fit the available data, including all methods based on input data consisting of clades, triples, quartets, or trees with any number of taxa, and also sequence-based approaches such as popular formalisations of maximum parsimony and maximum likelihood for networks. This identifiability problem is partially solved by accounting for branch lengths, although this merely reduces the frequency of the problem. Here we propose that network inference methods should only attempt to reconstruct what they can uniquely identify. To this end, we introduce a novel definition of what constitutes a uniquely reconstructible network. For any given set of indistinguishable networks, we define a canonical network that, under mild assumptions, is unique and thus representative of the entire set. Given data that underwent reticulate evolution, only the canonical form of the underlying phylogenetic network can be uniquely reconstructed. While on the methodological side this will imply a drastic reduction of the solution space in network inference, for the study of reticulate evolution this is a fundamental limitation that will require an important change of perspective when interpreting phylogenetic networks.  相似文献   

6.
Phylogenetic networks are a generalization of phylogenetic trees that allow for the representation of nontreelike evolutionary events, like recombination, hybridization, or lateral gene transfer. While much progress has been made to find practical algorithms for reconstructing a phylogenetic network from a set of sequences, all attempts to endorse a class of phylogenetic networks (strictly extending the class of phylogenetic trees) with a well-founded distance measure have, to the best of our knowledge and with the only exception of the bipartition distance on regular networks, failed so far. In this paper, we present and study a new meaningful class of phylogenetic networks, called tree-child phylogenetic networks, and we provide an injective representation of these networks as multisets of vectors of natural numbers, their path multiplicity vectors. We then use this representation to define a distance on this class that extends the well-known Robinson-Foulds distance for phylogenetic trees and to give an alignment method for pairs of networks in this class. Simple polynomial algorithms for reconstructing a tree-child phylogenetic network from its path multiplicity vectors, for computing the distance between two tree-child phylogenetic networks and for aligning a pair of tree-child phylogenetic networks, are provided. They have been implemented as a Perl package and a Java applet, which can be found at http://bioinfo.uib.es/~recerca/phylonetworks/mudistance/.  相似文献   

7.
A phylogenetic network is a rooted acyclic digraph with vertices corresponding to taxa. Let X denote a set of vertices containing the root, the leaves, and all vertices of outdegree 1. Regard X as the set of vertices on which measurements such as DNA can be made. A vertex is called normal if it has one parent, and hybrid if it has more than one parent. The network is called normal if it has no redundant arcs and also from every vertex there is a directed path to a member of X such that all vertices after the first are normal. This paper studies properties of normal networks. Under a simple model of inheritance that allows homoplasies only at hybrid vertices, there is essentially unique determination of the genomes at all vertices by the genomes at members of X if and only if the network is normal. This model is a limiting case of more standard models of inheritance when the substitution rate is sufficiently low. Various mathematical properties of normal networks are described. These properties include that the number of vertices grows at most quadratically with the number of leaves and that the number of hybrid vertices grows at most linearly with the number of leaves.  相似文献   

8.
In protein-protein interaction (PPI) networks, functional similarity is often inferred based on the function of directly interacting proteins, or more generally, some notion of interaction network proximity among proteins in a local neighborhood. Prior methods typically measure proximity as the shortest-path distance in the network, but this has only a limited ability to capture fine-grained neighborhood distinctions, because most proteins are close to each other, and there are many ties in proximity. We introduce diffusion state distance (DSD), a new metric based on a graph diffusion property, designed to capture finer-grained distinctions in proximity for transfer of functional annotation in PPI networks. We present a tool that, when input a PPI network, will output the DSD distances between every pair of proteins. We show that replacing the shortest-path metric by DSD improves the performance of classical function prediction methods across the board.  相似文献   

9.
Rooted phylogenetic networks are primarily used to represent conflicting evolutionary information and describe the reticulate evolutionary events in phylogeny. So far a lot of methods have been presented for constructing rooted phylogenetic networks, of which the methods based on the decomposition property of networks and by means of the incompatible graph (such as the CASS, the LNETWORK and the BIMLR) are more efficient than other available methods. The paper will discuss and compare these methods by both the practical and artificial datasets, in the aspect of the running time of the methods and the effective of constructed phylogenetic networks. The results show that the LNETWORK can construct much simper networks than the others.  相似文献   

10.
A bridge role metric model is put forward in this paper. Compared with previous metric models, our solution of a large-scale object-oriented software system as a complex network is inherently more realistic. To acquire nodes and links in an undirected network, a new model that presents the crucial connectivity of a module or the hub instead of only centrality as in previous metric models is presented. Two previous metric models are described for comparison. In addition, it is obvious that the fitting curve between the results and degrees can well be fitted by a power law. The model represents many realistic characteristics of actual software structures, and a hydropower simulation system is taken as an example. This paper makes additional contributions to an accurate understanding of module design of software systems and is expected to be beneficial to software engineering practices.  相似文献   

11.
The problem of constructing an optimal rooted phylogenetic network from an arbitrary set of rooted triplets is an NP-hard problem. In this paper, we present a heuristic algorithm called TripNet, which tries to construct a rooted phylogenetic network with the minimum number of reticulation nodes from an arbitrary set of rooted triplets. Despite of current methods that work for dense set of rooted triplets, a key innovation is the applicability of TripNet to non-dense set of rooted triplets. We prove some theorems to clarify the performance of the algorithm. To demonstrate the efficiency of TripNet, we compared TripNet with SIMPLISTIC. It is the only available software which has the ability to return some rooted phylogenetic network consistent with a given dense set of rooted triplets. But the results show that for complex networks with high levels, the SIMPLISTIC running time increased abruptly. However in all cases TripNet outputs an appropriate rooted phylogenetic network in an acceptable time. Also we tetsed TripNet on the Yeast data. The results show that Both TripNet and optimal networks have the same clustering and TripNet produced a level-3 network which contains only one more reticulation node than the optimal network.  相似文献   

12.
Phylogenies—the evolutionary histories of groups of organisms—play a major role in representing the interrelationships among biological entities. Many methods for reconstructing and studying such phylogenies have been proposed, almost all of which assume that the underlying history of a given set of species can be represented by a binary tree. Although many biological processes can be effectively modeled and summarized in this fashion, others cannot: recombination, hybrid speciation, and horizontal gene transfer result in networks of relationships rather than trees of relationships. In previous works, we formulated a maximum parsimony (MP) criterion for reconstructing and evaluating phylogenetic networks, and demonstrated its quality on biological as well as synthetic data sets. In this paper, we provide further theoretical results as well as a very fast heuristic algorithm for the MP criterion of phylogenetic networks. In particular, we provide a novel combinatorial definition of phylogenetic networks in terms of “forbidden cycles,” and provide detailed hardness and hardness of approximation proofs for the "small” MP problem. We demonstrate the performance of our heuristic in terms of time and accuracy on both biological and synthetic data sets. Finally, we explain the difference between our model and a similar one formulated by Nguyen et al., and describe the implications of this difference on the hardness and approximation results.  相似文献   

13.
Trees are commonly utilized to describe the evolutionary history of a collection of biological species, in which case the trees are called phylogenetic trees. Often these are reconstructed from data by making use of distances between extant species corresponding to the leaves of the tree. Because of increased recognition of the possibility of hybridization events, more attention is being given to the use of phylogenetic networks that are not necessarily trees. This paper describes the reconstruction of certain such networks from the tree-average distances between the leaves. For a certain class of phylogenetic networks, a polynomial-time method is presented to reconstruct the network from the tree-average distances. The method is proved to work if there is a single reticulation cycle.  相似文献   

14.
Currently, considerable interest exists with regard to the dissociation of close packed aminoacids within proteins, in the course of unfolding, which could result in either wet or dry moltenglobules. The progressive disjuncture of residues constituting the hydrophobic core ofcyclophilin from L. donovani (LdCyp) has been studied during the thermal unfolding of the molecule, by molecular dynamics simulations. LdCyp has been represented as a surface contactnetwork (SCN) based on the surface complementarity (Sm) of interacting residues within themolecular interior. The application of Sm to side chain packing within proteins make it a very sensitive indicator of subtle perturbations in packing, in the thermal unfolding of the protein. Network based metrics have been defined to track the sequential changes in the disintegration ofthe SCN spanning the hydrophobic core of LdCyp and these metrics prove to be highly sensitive compared to traditional metrics in indicating the increased conformational (and dynamical) flexibility in the network. These metrics have been applied to suggest criteria distinguishing DMG, WMG and transition state ensembles and to identify key residues involved in crucial conformational/topological events during the unfolding process.  相似文献   

15.
It is known that the Kimura 3ST model of sequence evolution on phylogenetic trees can be extended quite naturally to arbitrary split systems. However, this extension relies heavily on mathematical peculiarities of the associated Hadamard transformation, and providing an analogous augmentation of the general Markov model has thus far been elusive. In this paper, we rectify this shortcoming by showing how to extend the general Markov model on trees to include incompatible edges; and even further to more general network models. This is achieved by exploring the algebra of the generators of the continuous-time Markov chain together with the “splitting” operator that generates the branching process on phylogenetic trees. For simplicity, we proceed by discussing the two state case and then show that our results are easily extended to more states with little complication. Intriguingly, upon restriction of the two state general Markov model to the parameter space of the binary symmetric model, our extension is indistinguishable from the Hadamard approach only on trees; as soon as any incompatible splits are introduced the two approaches give rise to differing probability distributions with disparate structure. Through exploration of a simple example, we give an argument that our extension to more general networks has desirable properties that the previous approaches do not share. In particular, our construction allows for convergent evolution of previously divergent lineages; a property that is of significant interest for biological applications.  相似文献   

16.
Phototrophy, the conversion of light to biochemical energy, occurs throughout the Bacteria and plants, however, debate continues over how different phototrophic mechanisms and the bacteria that contain them are related. There are two types of phototrophic mechanisms in the Bacteria: reaction center type 1 (RC1) has core and core antenna domains that are parts of a single polypeptide, whereas reaction center type 2 (RC2) is composed of short core proteins without antenna domains. In cyanobacteria, RC2 is associated with separate core antenna proteins that are homologous to the core antenna domains of RC1. We reconstructed evolutionary relationships among phototrophic mechanisms based on a phylogeny of core antenna domains/proteins. Core antenna domains of 46 polypeptides were aligned, including the RC1 core proteins of heliobacteria, green sulfur bacteria, and photosystem I (PSI) of cyanobacteria and plastids, plus core antenna proteins of photosystem II (PSII) from cyanobacteria and plastids. Maximum likelihood, parsimony, and neighbor joining methods all supported a single phylogeny in which PSII core antenna proteins (PsbC, PsbB) arose within the cyanobacteria from duplications of the RC1-associated core antenna domains and accessory antenna proteins (IsiA, PcbA, PcbC) arose from duplications of PsbB. The data indicate an evolutionary history of RC1 in which an initially homodimeric reaction center was vertically transmitted to green sulfur bacteria, heliobacteria, and an ancestor of cyanobacteria. A heterodimeric RC1 (=PSI) then arose within the cyanobacterial lineage. In this scenario, the current diversity of core antenna domains/proteins is explained without a need to invoke horizontal transfer.This article contains online-only supplementary material.Reviewing Editor: Dr. W. Ford Doolittle  相似文献   

17.
In this paper, a class of rooted acyclic directed graphs (called TOM-networks) is defined that generalizes rooted trees and allows for models including hybridization events. It is argued that the defining properties are biologically plausible. Each TOM-network has a distance defined between each pair of vertices. For a TOM-network N, suppose that the set X consisting of the leaves and the root is known, together with the distances between members of X. It is proved that N is uniquely determined from this information and can be reconstructed in polynomial time. Thus, given exact distance information on the leaves and root, the phylogenetic network can be uniquely recovered, provided that it is a TOM-network. An outgroup can be used instead of a true root.  相似文献   

18.
We investigate a recent methodology we have proposed to extract valuable information on the competitiveness of countries and complexity of products from trade data. Standard economic theories predict a high level of specialization of countries in specific industrial sectors. However, a direct analysis of the official databases of exported products by all countries shows that the actual situation is very different. Countries commonly considered as developed ones are extremely diversified, exporting a large variety of products from very simple to very complex. At the same time countries generally considered as less developed export only the products also exported by the majority of countries. This situation calls for the introduction of a non-monetary and non-income-based measure for country economy complexity which uncovers the hidden potential for development and growth. The statistical approach we present here consists of coupled non-linear maps relating the competitiveness/fitness of countries to the complexity of their products. The fixed point of this transformation defines a metrics for the fitness of countries and the complexity of products. We argue that the key point to properly extract the economic information is the non-linearity of the map which is necessary to bound the complexity of products by the fitness of the less competitive countries exporting them. We present a detailed comparison of the results of this approach directly with those of the Method of Reflections by Hidalgo and Hausmann, showing the better performance of our method and a more solid economic, scientific and consistent foundation.  相似文献   

19.
Species in biology are traditionally perceived as kinds of organisms about which explanatory and predictive generalizations can be made, and biologists commonly use species in this manner. This perception of species is, however, in stark contrast with the currently accepted view that species are not kinds or classes at all, but individuals. In this paper I investigate the conditions under which the two views of species might be held simultaneously. Specifically, I ask whether upon acceptance of an ontology of species as diachronic segments of the tree of life (this is one version of the species as individuals ontology) species can perform the epistemic role of kinds of organisms to which explanatory and predictive generalizations apply. I show that, for species-level segments of the tree of life, several requirements have to be met before the performance of this epistemic role is possible, and I argue that these requirements can be met by defining species according to the Composite Species Concept proposed by Kornet and McAllister in the 1990s.  相似文献   

20.
A lipidome is the set of lipids in a given organism, cell or cell compartment and this set reflects the organism’s synthetic pathways and interactions with its environment. Recently, lipidomes of biological model organisms and cell lines were published and the number of functional studies of lipids is increasing. In this study we propose a homology metric that can quantify systematic differences in the composition of a lipidome. Algorithms were developed to 1. consistently convert lipids structure into SMILES, 2. determine structural similarity between molecular species and 3. describe a lipidome in a chemical space model. We tested lipid structure conversion and structure similarity metrics, in detail, using sets of isomeric ceramide molecules and chemically related phosphatidylinositols. Template-based SMILES showed the best properties for representing lipid-specific structural diversity. We also show that sequence analysis algorithms are best suited to calculate distances between such template-based SMILES and we adjudged the Levenshtein distance as best choice for quantifying structural changes. When all lipid molecules of the LIPIDMAPS structure database were mapped in chemical space, they automatically formed clusters corresponding to conventional chemical families. Accordingly, we mapped a pair of lipidomes into the same chemical space and determined the degree of overlap by calculating the Hausdorff distance. We named this metric the ‘Lipidome jUXtaposition (LUX) score’. First, we tested this approach for estimating the lipidome similarity on four yeast strains with known genetic alteration in fatty acid synthesis. We show that the LUX score reflects the genetic relationship and growth temperature better than conventional methods although the score is based solely on lipid structures. Next, we applied this metric to high-throughput data of larval tissue lipidomes of Drosophila. This showed that the LUX score is sufficient to cluster tissues and determine the impact of nutritional changes in an unbiased manner, despite the limited information on the underlying structural diversity of each lipidome. This study is the first effort to define a lipidome homology metric based on structures that will enrich functional association of lipids in a similar manner to measures used in genetics. Finally, we discuss the significance of the LUX score to perform comparative lipidome studies across species borders.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号