20 similar articles found (search time: 15 ms)
1.
van Iersel Leo, Keijsper Judith, Kelk Steven, Stougie Leen, Hagen Ferry, Boekhout Teun 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2009,6(4):667-681
Jansson and Sung showed that, given a dense set of input triplets T (representing hypotheses about the local evolutionary relationships of triplets of taxa), it is possible to determine in polynomial time whether there exists a level-1 network consistent with T, and if so, to construct such a network [24]. Here, we extend this work by showing that this problem is even polynomial time solvable for the construction of level-2 networks. This shows that, assuming density, it is tractable to construct plausible evolutionary histories from input triplets even when such histories are heavily nontree-like. This further strengthens the case for the use of triplet-based methods in the construction of phylogenetic networks. We also implemented the algorithm and applied it to yeast data.
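As a minimal illustration of the kind of input entry 1 works with, the sketch below checks whether a triplet set is dense and whether a rooted tree (the simplest, reticulation-free case) is consistent with it. Density is taken here to mean that every three taxa are covered by at least one triplet; that definition is an assumption, since the abstract does not spell it out. The nested-tuple tree encoding and the names `is_dense`, `clusters`, and `tree_is_consistent` are hypothetical, and this is not the level-2 algorithm of the paper.

```python
from itertools import combinations


def is_dense(triplets, taxa):
    """True if every 3-subset of taxa is covered by at least one triplet."""
    covered = {frozenset(t) for t in triplets}
    return all(frozenset(three) in covered for three in combinations(taxa, 3))


def clusters(tree):
    """Leaf sets below each internal node of a rooted tree (nested tuples)."""
    found = set()

    def leaves_below(node):
        if isinstance(node, str):          # a leaf is a plain string label
            return frozenset([node])
        below = frozenset().union(*(leaves_below(child) for child in node))
        found.add(below)
        return below

    leaves_below(tree)
    return found


def tree_is_consistent(tree, triplets):
    """A triplet (a, b, c), read as ab|c, is consistent with a rooted tree
    if some cluster contains a and b but not c."""
    cl = clusters(tree)
    return all(
        any(a in c and b in c and x not in c for c in cl)
        for a, b, x in triplets
    )


tree = (("a", "b"), ("c", "d"))
triplets = [("a", "b", "c"), ("a", "b", "d"), ("c", "d", "a"), ("c", "d", "b")]
print(is_dense(triplets, ["a", "b", "c", "d"]), tree_is_consistent(tree, triplets))
```

For networks with reticulations the consistency test is more involved, because a triplet may be displayed through several alternative paths; the cluster test above only applies to trees.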
2.
Juan Wang 《PloS one》2016,11(11)
Rooted phylogenetic networks are primarily used to represent conflicting evolutionary information and to describe reticulate evolutionary events in phylogeny. Many methods have been presented for constructing rooted phylogenetic networks; of these, the methods based on the decomposition property of networks and on the incompatible graph (such as CASS, LNETWORK and BIMLR) are more efficient than the other available methods. This paper discusses and compares these methods on both practical and artificial datasets, in terms of running time and the effectiveness of the constructed phylogenetic networks. The results show that LNETWORK can construct much simpler networks than the others.
3.
Huson Daniel H. 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2009,6(1):103-109
The evolutionary history of a collection of species is usually represented by a phylogenetic tree. Sometimes, phylogenetic networks are used as a means of representing reticulate evolution or of showing uncertainty and incompatibilities in evolutionary datasets. This is often done using unrooted phylogenetic networks such as split networks, due in part to the availability of software (SplitsTree) for their computation and visualization. In this paper we discuss the problem of drawing rooted phylogenetic networks as cladograms or phylograms in a number of different views that are commonly used for rooted trees. Implementations of the algorithms are available in new releases of the Dendroscope and SplitsTree programs.
4.
Cardona Gabriel, Llabres Merce, Rossello Francesc, Valiente Gabriel 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2009,6(3):454-469
The assessment of phylogenetic network reconstruction methods requires the ability to compare phylogenetic networks. This is the second in a series of papers devoted to the analysis and comparison of metrics for tree-child time consistent phylogenetic networks on the same set of taxa. In this paper, we generalize to phylogenetic networks two metrics that have already been introduced in the literature for phylogenetic trees: the nodal distance and the triplets distance. We prove that they are metrics on any class of tree-child time consistent phylogenetic networks on the same set of taxa, as well as some basic properties for them. To prove these results, we introduce a reduction/expansion procedure that can be used not only to establish properties of tree-child time consistent phylogenetic networks by induction, but also to generate all tree-child time consistent phylogenetic networks with a given number of leaves.
5.
Leo van Iersel, Vincent Moulton, Eveline de Swart, Taoyang Wu 《Bulletin of mathematical biology》2017,79(5):1135-1154
Phylogenetic networks are a generalization of evolutionary trees that are used by biologists to represent the evolution of organisms which have undergone reticulate evolution. Essentially, a phylogenetic network is a directed acyclic graph having a unique root in which the leaves are labelled by a given set of species. Recently, some approaches have been developed to construct phylogenetic networks from collections of 2- and 3-leaved networks, which are known as binets and trinets, respectively. Here we study in more depth properties of collections of binets, one of the simplest possible types of networks into which a phylogenetic network can be decomposed. More specifically, we show that if a collection of level-1 binets is compatible with some binary network, then it is also compatible with a binary level-1 network. Our proofs are based on useful structural results concerning lowest stable ancestors in networks. In addition, we show that, although the binets do not determine the topology of the network, they do determine the number of reticulations in the network, which is one of its most important parameters. We also consider algorithmic questions concerning binets. We show that deciding whether an arbitrary set of binets is compatible with some network is at least as hard as the well-known graph isomorphism problem. However, if we restrict to level-1 binets, it is possible to decide in polynomial time whether there exists a binary network that displays all the binets. We also show that to find a network that displays a maximum number of the binets is NP-hard, but that there exists a simple polynomial-time 1/3-approximation algorithm for this problem. It is hoped that these results will eventually assist in the development of new methods for constructing phylogenetic networks from collections of smaller networks.
6.
A Differential Method for Estimation of Type Frequencies in Triplets and Quadruplets
Gordon Allen 《American journal of human genetics》1960,12(2):210-224
7.
Cardona Gabriel, Llabres Merce, Rossello Francesc, Valiente Gabriel 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2009,6(4):629-638
We prove that Nakhleh's metric for reduced phylogenetic networks is also a metric on the classes of tree-child phylogenetic networks, semibinary tree-sibling time consistent phylogenetic networks, and multilabeled phylogenetic trees. We also prove that it separates distinguishable phylogenetic networks. In this way, it becomes the strongest dissimilarity measure for phylogenetic networks available so far. Furthermore, we propose a generalization of that metric that separates arbitrary phylogenetic networks.
8.
A contemporary and fundamental problem faced by many evolutionary biologists is how to puzzle together a collection ℘ of partial trees (leaf-labeled trees whose leaves are bijectively labeled by species or, more generally, taxa, each supported by, e.g., a gene) into an overall parental structure that displays all trees in ℘. This already difficult problem is complicated by the fact that the trees in ℘ regularly support conflicting phylogenetic relationships and are not on the same but only overlapping taxa sets. A desirable requirement on the sought after parental structure, therefore, is that it can accommodate the observed conflicts. Phylogenetic networks are a popular tool capable of doing precisely this. However, not much is known about how to construct such networks from partial trees, a notable exception being the Z-closure super-network approach, which is based on the Z-closure rule, and the Q-imputation approach. Although attractive approaches, they both suffer from the fact that the generated networks tend to be multidimensional making it necessary to apply some kind of filter to reduce their complexity. To avoid having to resort to a filter, we follow a different line of attack in this paper and develop closure rules for generating circular phylogenetic networks which have the attractive property that they can be represented in the plane. In particular, we introduce the novel Y-(closure) rule and show that this rule on its own or in combination with one of Meacham’s closure rules (which we call the M-rule) has some very desirable theoretical properties. In addition, we present a case study based on Rivera et al. “ring of life” to explore the reconstructive power of the M- and Y-rule and also reanalyze an Arabidopsis thaliana data set.
9.
Stephen J. Willson 《Bulletin of mathematical biology》2013,75(10):1840-1878
Trees are commonly utilized to describe the evolutionary history of a collection of biological species, in which case the trees are called phylogenetic trees. Often these are reconstructed from data by making use of distances between extant species corresponding to the leaves of the tree. Because of increased recognition of the possibility of hybridization events, more attention is being given to the use of phylogenetic networks that are not necessarily trees. This paper describes the reconstruction of certain such networks from the tree-average distances between the leaves. For a certain class of phylogenetic networks, a polynomial-time method is presented to reconstruct the network from the tree-average distances. The method is proved to work if there is a single reticulation cycle.
10.
Cardona Gabriel, Llabrés Mercè, Rosselló Francesc, Valiente Gabriel 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2009,6(1):46-61
The assessment of phylogenetic network reconstruction methods requires the ability to compare phylogenetic networks. This is the first in a series of papers devoted to the analysis and comparison of metrics for tree-child time consistent phylogenetic networks on the same set of taxa. In this paper, we study three metrics that have already been introduced in the literature: the Robinson-Foulds distance, the tripartitions distance and the $\mu$-distance. They generalize to networks the classical Robinson-Foulds or partition distance for phylogenetic trees. We analyze the behavior of these metrics by studying their least and largest values and when they achieve them. As a by-product of this study, we obtain tight bounds on the size of a tree-child time consistent phylogenetic network.
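For orientation, the classical Robinson-Foulds distance on rooted trees that entry 10 takes as its starting point can be sketched as the size of the symmetric difference between the two trees' cluster sets. The nested-tuple encoding and the names `clusters` and `robinson_foulds` are hypothetical, some authors halve the count, and the network generalizations studied in the paper are not implemented here.

```python
def clusters(tree):
    """Leaf sets below each internal node of a rooted tree (nested tuples)."""
    found = set()

    def leaves_below(node):
        if isinstance(node, str):          # a leaf is a plain string label
            return frozenset([node])
        below = frozenset().union(*(leaves_below(child) for child in node))
        found.add(below)
        return below

    leaves_below(tree)
    return found


def robinson_foulds(tree1, tree2):
    """Number of clusters present in exactly one of the two trees."""
    return len(clusters(tree1) ^ clusters(tree2))


balanced = (("a", "b"), ("c", "d"))
swapped = (("a", "c"), ("b", "d"))
caterpillar = ((("a", "b"), "c"), "d")

print(robinson_foulds(balanced, balanced))     # identical trees
print(robinson_foulds(balanced, swapped))      # no non-trivial cluster shared
print(robinson_foulds(balanced, caterpillar))  # cluster {a, b} is shared
```

Both trees must be on the same set of taxa for the comparison to be meaningful, which matches the setting of the abstract.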
11.
A method for constructing reshaped single-domain antibodies
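To validate a practical new method for constructing reshaped single-domain antibodies, a reshaped anti-CD28 heavy-chain single-domain antibody was constructed with it. Unlike earlier methods, this method does not require modeling the three-dimensional structure of the antibody to determine the human acceptor framework regions (FRs) or which amino acid residues in the human acceptor FRs need to be mutated, and it completes antibody reshaping and affinity maturation in a single process. Starting from the amino acid sequence of a murine anti-CD28 heavy-chain single-domain antibody, the two most homologous human antibody sequences were retrieved from GenBank, and the FRs of one of them were used as the main framework of the reshaped antibody. After the CDRs of the murine antibody were inserted into the human FRs, some amino acid residues in the human FRs were replaced by mutation. The number of residues replaced and the replacement rules were based mainly on a comparison of the retrieved human antibody sequences, the murine antibody sequence, and the species group sequences in the Kabat classification. To increase the diversity of the reshaped antibody genes, degenerate synthesis was used at the positions to be replaced, so that both the original residue and the replacement residue could appear, each with a probability of 50%. In addition, when the synthetic nucleotide fragments of different sizes were amplified by overlap PCR to obtain the complete reshaped antibody gene, a high Mg2+ concentration and Taq DNA polymerase were used to introduce further random mutations. A phage antibody library was constructed from the overlap PCR products, and after three rounds of panning several reshaped antibodies with relatively high immunological activity were obtained. Two of these antibodies were studied further: their genes were expressed in Escherichia coli BL21(DE3), and the refolded proteins retained relatively high immunological activity. The results show that the method is effective and feasible.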
12.
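The conventional method for constructing recombinant baculoviruses was improved as follows: plasmid DNA alone is first transfected into insect Sf cells by calcium phosphate coprecipitation, with the recombinant plasmid purified by polyethylene glycol precipitation, and 12 to 24 h later the cells are challenged with a low dose of virus. The improved method is simple, time-saving and economical, gives a high recombination rate, and is suitable for use in ordinary laboratories.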
13.
Willson SJ 《Bulletin of mathematical biology》2006,68(4):919-944
In this paper, a class of rooted acyclic directed graphs (called TOM-networks) is defined that generalizes rooted trees and allows for models including hybridization events. It is argued that the defining properties are biologically plausible. Each TOM-network has a distance defined between each pair of vertices. For a TOM-network N, suppose that the set X consisting of the leaves and the root is known, together with the distances between members of X. It is proved that N is uniquely determined from this information and can be reconstructed in polynomial time. Thus, given exact distance information on the leaves and root, the phylogenetic network can be uniquely recovered, provided that it is a TOM-network. An outgroup can be used instead of a true root.
14.
Willson SJ 《Bulletin of mathematical biology》2007,69(8):2561-2590
Suppose G is a phylogenetic network given as a rooted acyclic directed graph. Let X be a subset of the vertex set containing the root, all leaves, and all vertices of outdegree 1. A vertex is “regular” if it has a unique parent, and “hybrid” if it has two parents. Consider the case where each gene is binary. Assume an idealized system of inheritance in which no homoplasies occur at regular vertices, but homoplasies can occur at hybrid vertices. Under our model, the distances between taxa are shown to be described using a system of numbers called “originating weights” and “homoplasy weights.” Assume that the distances are known between all members of X. Sufficient conditions are given such that the graph G and all the originating and homoplasy weights can be reconstructed from the given distances.
15.
Kansuporn Sriyudthsak, Michio Iwata, Masami Yokota Hirai, Fumihide Shiraishi 《Bulletin of mathematical biology》2014,76(6):1333-1351
The availability of large-scale datasets has led to more effort being made to understand characteristics of metabolic reaction networks. However, because the large-scale data are semi-quantitative, and may contain biological variations and/or analytical errors, it remains a challenge to construct a mathematical model with precise parameters using only these data. The present work proposes a simple method, referred to as PENDISC (Parameter Estimation in a Non-DImensionalized S-system with Constraints), to assist the complex process of parameter estimation in the construction of a mathematical model for a given metabolic reaction system. The PENDISC method was evaluated using two simple mathematical models: a linear metabolic pathway model with inhibition and a branched metabolic pathway model with inhibition and activation. The results indicate that a smaller number of data points and rate constant parameters enhances the agreement between calculated values and time-series data of metabolite concentrations, and leads to faster convergence when the same initial estimates are used for the fitting. This method is also shown to be applicable to noisy time-series data and to unmeasurable metabolite concentrations in a network, and to have a potential to handle metabolome data of a relatively large-scale metabolic reaction system. Furthermore, it was applied to aspartate-derived amino acid biosynthesis in Arabidopsis thaliana plant. The result provides confirmation that the mathematical model constructed satisfactorily agrees with the time-series datasets of seven metabolite concentrations.
16.
A new method for constructing molecular marker linkage maps: the three-point selfing method
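The authors derive mathematically a three-point selfing method for gene mapping. Like the three-point testcross, this method provides the full range of mapping information, but it does not require breeding a triple-recessive homozygous parent or line, and so greatly improves mapping efficiency. It is shown theoretically that the method is also suitable for constructing molecular marker linkage maps from small populations, and the Fisher information of a single observation (the F information) shows that three-point selfing is an efficient mapping method. Applied to the data for the first 6 of the 12 RFLP marker loci scored in the 333 individuals of the mouse F2 population supplied with the MAPMAKER program, the three-point selfing method computed map distances as well as MAPMAKER does, and it additionally provided information on crossover interference between loci and on the coupling or repulsion configuration of loci, together with evidence of negative interference between tightly linked loci.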
17.
Phylogenetic networks represent the evolution of organisms that have undergone reticulate events, such as recombination, hybrid speciation or lateral gene transfer. An important way to interpret a phylogenetic network is in terms of the trees it displays, which represent all the possible histories of the characters carried by the organisms in the network. Interestingly, however, different networks may display exactly the same set of trees, an observation that poses a problem for network reconstruction: from the perspective of many inference methods such networks are indistinguishable. This is true for all methods that evaluate a phylogenetic network solely on the basis of how well the displayed trees fit the available data, including all methods based on input data consisting of clades, triples, quartets, or trees with any number of taxa, and also sequence-based approaches such as popular formalisations of maximum parsimony and maximum likelihood for networks. This identifiability problem is partially solved by accounting for branch lengths, although this merely reduces the frequency of the problem. Here we propose that network inference methods should only attempt to reconstruct what they can uniquely identify. To this end, we introduce a novel definition of what constitutes a uniquely reconstructible network. For any given set of indistinguishable networks, we define a canonical network that, under mild assumptions, is unique and thus representative of the entire set. Given data that underwent reticulate evolution, only the canonical form of the underlying phylogenetic network can be uniquely reconstructed. While on the methodological side this will imply a drastic reduction of the solution space in network inference, for the study of reticulate evolution this is a fundamental limitation that will require an important change of perspective when interpreting phylogenetic networks.
18.
We introduce here a new method for computing differences between microbial communities based on phylogenetic information. This method, UniFrac, measures the phylogenetic distance between sets of taxa in a phylogenetic tree as the fraction of the branch length of the tree that leads to descendants from either one environment or the other, but not both. UniFrac can be used to determine whether communities are significantly different, to compare many communities simultaneously using clustering and ordination techniques, and to measure the relative contributions of different factors, such as chemistry and geography, to similarities between samples. We demonstrate the utility of UniFrac by applying it to published 16S rRNA gene libraries from cultured isolates and environmental clones of bacteria in marine sediment, water, and ice. Our results reveal that (i) cultured isolates from ice, water, and sediment resemble each other and environmental clone sequences from sea ice, but not environmental clone sequences from sediment and water; (ii) the geographical location does not correlate strongly with bacterial community differences in ice and sediment from the Arctic and Antarctic; and (iii) bacterial communities differ between terrestrially impacted seawater (whether polar or temperate) and warm oligotrophic seawater, whereas those in individual seawater samples are not more similar to each other than to those in sediment or ice samples. These results illustrate that UniFrac provides a new way of characterizing microbial communities, using the wealth of environmental rRNA sequences, and allows quantitative insight into the factors that underlie the distribution of lineages among environments.
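The definition in entry 18 (the fraction of branch length leading to descendants from one environment or the other, but not both) can be sketched on a toy tree as follows. The dictionary-based tree encoding, the function name `unifrac`, and the example data are hypothetical, and this is only the unweighted calculation as described in the abstract, not the published implementation.

```python
from collections import defaultdict


def unifrac(parent, length, leaf_env, env_a, env_b):
    """Unweighted UniFrac-style distance between two environments.

    parent:   maps each non-root node to its parent
    length:   maps each non-root node to the length of the branch above it
    leaf_env: maps each leaf to the set of environments it was sampled from
    """
    # For every branch (identified by its lower node), collect the
    # environments found among the leaves below it.
    envs_below = defaultdict(set)
    for leaf, envs in leaf_env.items():
        node = leaf
        while node in parent:              # stop once the root is reached
            envs_below[node] |= envs
            node = parent[node]

    unique = shared = 0.0
    for node, envs in envs_below.items():
        in_a, in_b = env_a in envs, env_b in envs
        if in_a and in_b:
            shared += length[node]         # branch leads to both environments
        elif in_a or in_b:
            unique += length[node]         # branch leads to exactly one
    total = unique + shared
    return unique / total if total else 0.0


# Toy tree: root R has children X and leaf c; X has leaves a and b.
parent = {"a": "X", "b": "X", "X": "R", "c": "R"}
length = {"a": 1.0, "b": 1.0, "X": 1.0, "c": 2.0}
leaf_env = {"a": {"ice"}, "b": {"water"}, "c": {"ice"}}

print(unifrac(parent, length, leaf_env, "ice", "water"))
```

In this toy tree only the branch above X leads to both environments, so four of the five units of branch length count as unique to one environment.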
19.
Our ability to construct very large phylogenetic trees is becoming more important as vast amounts of sequence data are becoming readily available. Neighbor joining (NJ) is a widely used distance-based phylogenetic tree construction method that has historically been considered fast, but it is prohibitively slow for building trees from increasingly large datasets. We developed a fast variant of NJ called relaxed neighbor joining (RNJ) and performed experiments to measure the speed improvement over NJ. Since repeated runs of the RNJ algorithm generate a superset of the trees that repeated NJ runs generate, we also assessed tree quality. RNJ is dramatically faster than NJ, and the quality of resulting trees is very similar for the two algorithms. The results indicate that RNJ is a reasonable alternative to NJ and that it is especially well suited for uses that involve large numbers of taxa or highly repetitive procedures such as bootstrapping. [Reviewing Editor: Dr. James Bull]
20.
We develop a maximum penalized-likelihood (MPL) method to estimate the fitnesses of amino acids and the distribution of selection coefficients (S = 2Ns) in protein-coding genes from phylogenetic data. This improves on a previous maximum-likelihood method. Various penalty functions are used to penalize extreme estimates of the fitnesses, thus correcting overfitting by the previous method. Using a combination of computer simulation and real data analysis, we evaluate the effect of the various penalties on the estimation of the fitnesses and the distribution of S. We show the new method regularizes the estimates of the fitnesses for small, relatively uninformative data sets, but it can still recover the large proportion of deleterious mutations when present in simulated data. Computer simulations indicate that as the number of taxa in the phylogeny or the level of sequence divergence increases, the distribution of S can be more accurately estimated. Furthermore, the strength of the penalty can be varied to study how informative a particular data set is about the distribution of S. We analyze three protein-coding genes (the chloroplast rubisco protein, mammal mitochondrial proteins, and an influenza virus polymerase) and show the new method recovers a large proportion of deleterious mutations in these data, even under strong penalties, confirming the distribution of S is bimodal in these real data. We recommend the use of the new MPL approach for the estimation of the distribution of S in species phylogenies of protein-coding genes.