Similar literature
 Found 20 similar articles; search took 0 ms
1.
Parsimony methods infer phylogenetic trees by minimizing the number of character changes required to explain observed character states. From the perspective of applicability of parsimony methods, it is important to assess whether the characters used to infer phylogeny are likely to provide a correct tree. We introduce a graph theoretical characterization that helps to assess whether a given set of characters is appropriate to use with parsimony methods. Given a set of characters and a set of taxa, we construct a network called the character overlap graph. We show that the character overlap graph for characters that are appropriate to use in parsimony methods is characterized by significant under-representation of subnetworks known as holes, and provide a validation for this observation. This characterization explains success in constructing evolutionary trees using the parsimony method for some characters (e.g., protein domains) and lack of such success for other characters (e.g., introns). In the latter case, the understanding of obstacles to applying parsimony methods in a direct way has led us to a new approach for detecting inconsistent and/or noisy data. Namely, we introduce the concept of stable characters, which is similar to but less restrictive than the well-known concept of pairwise compatible characters. Application of this approach to introns produces an evolutionary tree consistent with the Coelomata hypothesis.
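
The parsimony criterion mentioned in this abstract can be illustrated with a minimal sketch. It uses Fitch's classical small-parsimony procedure for a single character on a fixed rooted binary tree; this is a standard textbook technique, not the character overlap graph method of the abstract. The tree encoding (nested pairs of leaf names) and the function name `fitch_score` are assumptions made for illustration.

```python
def fitch_score(tree, states):
    """Minimum number of state changes for one character on a rooted binary tree.

    tree:   a leaf name (str) or a pair (left_subtree, right_subtree).
    states: dict mapping leaf name -> observed character state.
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):       # leaf: candidate set is the observed state
            return {states[node]}
        left, right = visit(node[0]), visit(node[1])
        common = left & right
        if common:                      # children share a state: no change counted
            return common
        changes += 1                    # children disagree: count one change
        return left | right

    visit(tree)
    return changes


example_tree = (("A", "B"), ("C", "D"))
print(fitch_score(example_tree, {"A": 0, "B": 0, "C": 1, "D": 1}))  # 1
print(fitch_score(example_tree, {"A": 0, "B": 1, "C": 0, "D": 1}))  # 2
```

The first character fits the tree with a single change, while the second needs two, which is the sense in which a character can agree or conflict with a candidate tree.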

2.
Rooted phylogenetic trees constructed from different datasets (e.g. from different genes) often conflict with one another, i.e. they cannot be integrated into a single phylogenetic tree. Phylogenetic networks have become an important tool in molecular evolution, and rooted phylogenetic networks are able to represent conflicting rooted phylogenetic trees. Hence, the development of appropriate methods to compute rooted phylogenetic networks from rooted phylogenetic trees has attracted considerable research interest of late. The CASS algorithm proposed by van Iersel et al. is able to construct much simpler networks than other available methods, but it is extremely slow, and the networks it constructs are dependent on the order of the input data. Here, we introduce an improved CASS algorithm, BIMLR. We show that BIMLR is faster than CASS and less dependent on the input data order. Moreover, BIMLR is able to construct much simpler networks than almost all other methods. BIMLR is available at http://nclab.hit.edu.cn/wangjuan/BIMLR/.

3.
Recently, much attention has been devoted to the construction of phylogenetic networks, which generalize phylogenetic trees in order to accommodate complex evolutionary processes. Here, we present an efficient, practical algorithm for reconstructing level-1 phylogenetic networks (a type of network slightly more general than a phylogenetic tree) from triplets. Our algorithm has been made publicly available as the program LEV1ATHAN. It combines ideas from several known theoretical algorithms for phylogenetic tree and network reconstruction with two novel subroutines, namely an exponential-time exact algorithm and a greedy algorithm, both of which are of independent theoretical interest. Most importantly, LEV1ATHAN runs in polynomial time and always constructs a level-1 network. If the data are consistent with a phylogenetic tree, then the algorithm constructs such a tree. Moreover, if the input triplet set is dense and, in addition, is fully consistent with some level-1 network, it will find such a network. The potential of LEV1ATHAN is explored by means of an extensive simulation study and a biological data set. One of our conclusions is that LEV1ATHAN is able to construct networks consistent with a high percentage of input triplets, even when these input triplets are affected by a low to moderate level of noise.
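
To make "consistent with a high percentage of input triplets" concrete, here is a minimal sketch that checks which rooted triplets ab|c are displayed by a rooted tree. It covers the tree case only and is not LEV1ATHAN or any part of it; the nested-tuple tree encoding and all function names are assumptions made for illustration.

```python
def leaf_paths(tree, prefix=()):
    """Map each leaf name to its path of child indices from the root."""
    if isinstance(tree, str):
        return {tree: prefix}
    paths = {}
    for i, child in enumerate(tree):
        paths.update(leaf_paths(child, prefix + (i,)))
    return paths


def lca_depth(p, q):
    """Depth of the lowest common ancestor = length of the shared path prefix."""
    depth = 0
    for x, y in zip(p, q):
        if x != y:
            break
        depth += 1
    return depth


def displays(tree, triplet):
    """True if the rooted tree displays the triplet (a, b, c), read as ab|c."""
    a, b, c = triplet
    paths = leaf_paths(tree)
    ab = lca_depth(paths[a], paths[b])
    return ab > lca_depth(paths[a], paths[c]) and ab > lca_depth(paths[b], paths[c])


def fraction_consistent(tree, triplets):
    """Share of the input triplets that the tree displays."""
    return sum(displays(tree, t) for t in triplets) / len(triplets)


tree = ((("a", "b"), "c"), "d")
triplets = [("a", "b", "c"), ("a", "c", "b"), ("a", "c", "d"), ("b", "d", "a")]
print(fraction_consistent(tree, triplets))  # 0.5
```

A triplet ab|c is displayed when the lowest common ancestor of a and b lies strictly below that of a and c, which the path-prefix comparison captures.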

4.
5.
Perfect phylogenetic networks with recombination.   Cited by: 1 (self-citations: 0, citations by others: 1)
The perfect phylogeny problem is a classical problem in evolutionary tree construction. In this paper, we propose a new model called phylogenetic network with recombination that takes recombination events into account. We show that the problem of finding a perfect phylogenetic network with the minimum number of recombination events is NP-hard; we also present an efficient polynomial time algorithm for an interesting restricted version of the problem.
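
As background for the classical problem named here, the following minimal sketch tests whether a set of binary characters admits a perfect phylogeny without recombination, using the well-known four-gamete (pairwise compatibility) condition. It does not implement the network model or the algorithm of the paper; the matrix layout (rows are taxa, columns are characters) and the function names are assumptions.

```python
from itertools import combinations


def compatible(col_i, col_j):
    """Two binary characters are compatible unless all four state pairs occur."""
    return len(set(zip(col_i, col_j))) < 4


def admits_perfect_phylogeny(matrix):
    """matrix: list of rows (taxa), each a list of 0/1 character states."""
    columns = list(zip(*matrix))
    return all(compatible(a, b) for a, b in combinations(columns, 2))


print(admits_perfect_phylogeny([[1, 0], [1, 1], [0, 0]]))          # True
print(admits_perfect_phylogeny([[0, 0], [0, 1], [1, 0], [1, 1]]))  # False
```

When the test fails, no tree explains the data with one change per character, which is the situation that models with recombination events are designed to handle.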

6.
Now that large-scale genome-sequencing projects are sampling many organismal lineages, it is becoming possible to compare large data sets of not only DNA and protein sequences, but also genome-level features, such as gene arrangements and the positions of mobile genetic elements. Although it is unlikely that comparisons of such features will address a large number of evolutionary branch points across the broad tree of life owing to the infeasibility of such sampling, they have great potential for resolving many crucial, contested relationships for which no other data seem promising. Here, I discuss the advancements, advantages, methods, and problems of the use of genome-level characters for reconstructing evolutionary relationships.

7.
8.
The comparative approach is routinely used to test for possible correlations between phenotypic or life-history traits. To correct for phylogenetic inertia, the method of independent contrasts assumes that continuous characters evolve along the phylogeny according to a multivariate Brownian process. Brownian diffusion processes have also been used to describe time variations of the parameters of the substitution process, such as the rate of substitution or the ratio of synonymous to nonsynonymous substitutions. Here, we develop a probabilistic framework for testing the coupling between continuous characters and parameters of the molecular substitution process. Rates of substitution and continuous characters are jointly modeled as a multivariate Brownian diffusion process with an unknown covariance matrix. The covariance matrix, the divergence times and the phylogenetic variations of substitution rates and continuous characters are all jointly estimated in a Bayesian Monte Carlo framework, imposing on the covariance matrix a prior conjugate to the Brownian process so as to achieve greater computational efficiency. The coupling between rates and phenotypes is assessed by measuring the posterior probability of positive or negative covariances, whereas divergence dates and phenotypic variations are marginally reconstructed in the context of the joint analysis. As an illustration, we apply the model to a set of 410 mammalian cytochrome b sequences. We observe a negative correlation between the rate of substitution and mass and longevity, as previously observed. We also find a positive correlation between ω = dN/dS and mass and longevity, which we interpret as an indirect effect of variations of effective population size, thus in partial agreement with the nearly neutral theory. The method can easily be extended to any parameter of the substitution process and to any continuous phenotypic or environmental character.
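
The method of independent contrasts that this abstract takes as its starting point can be sketched for a single trait as follows. This is the classical contrast computation under Brownian motion, not the Bayesian framework developed in the paper; the tuple encoding of the tree and the function name `contrasts` are assumptions, and all branch lengths below the root are assumed to be positive.

```python
import math


def contrasts(node):
    """Return (value, branch_length, list_of_contrasts) for a subtree.

    A leaf is (trait_value, branch_length); an internal node is
    (left_subtree, right_subtree, branch_length).
    """
    if len(node) == 2:                      # leaf
        value, length = node
        return value, length, []
    left, right, length = node
    x1, v1, c1 = contrasts(left)
    x2, v2, c2 = contrasts(right)
    # Difference between sister lineages, scaled by its expected standard deviation.
    standardized = (x1 - x2) / math.sqrt(v1 + v2)
    # Ancestral estimate: average weighted by the inverse branch lengths.
    value = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)
    # Lengthen the branch above this node to reflect the estimation uncertainty.
    length = length + v1 * v2 / (v1 + v2)
    return value, length, c1 + c2 + [standardized]


# Three tips with trait values 1, 3 and 6; the root branch has length 0.
tree = (((1.0, 1.0), (3.0, 1.0), 1.0), (6.0, 2.0), 0.0)
root_value, _, values = contrasts(tree)
print(root_value, values)
```

A tree with n tips yields n - 1 contrasts, which can then be treated as independent observations when testing for a correlation between two traits.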

9.
Phylogenetic networks were introduced to describe evolution in the presence of exchanges of genetic material between coexisting species or individuals. Split networks in particular were introduced as a special kind of abstract network to visualize conflicts between phylogenetic trees which may correspond to such exchanges. More recently, methods were designed to reconstruct explicit phylogenetic networks (whose vertices can be interpreted as biological events) from triplet data. In this article, we link abstract and explicit networks through their combinatorial properties, by introducing the unrooted analog of level-k networks. In particular, we give an equivalence theorem between circular split systems and unrooted level-1 networks. We also show how to adapt to quartets some existing results on triplets, in order to reconstruct unrooted level-k phylogenetic networks. These results give an interesting perspective on the combinatorics of phylogenetic networks and also raise algorithmic and combinatorial questions.

10.
In the 1990s Rubiaceae became a hot spot for systematists, mainly due to the comprehensive treatment of the family by Robbrecht in 1988. Next to the exploration of macromolecular characters to infer the phylogeny, the palynology of Rubiaceae finally received the attention it deserves. This article aims to present a state-of-the-art analysis of the systematic palynology of the family. The range of variation in pollen morphology is wide, and some of the pollen features are not known from other angiosperm taxa; e.g., a looplike or spiral pattern for the position of apertures in pantoaperturate grains. We compiled an online database at the generic level for the major pollen characters and orbicule presence in Rubiaceae. An overview of the variation is presented here and illustrated per character: dispersal unit, pollen size and shape, aperture number, position and type, sexine ornamentation, nexine pattern, and stratification of the sporoderm. The presence/absence and morphological variation of orbicules at the generic level is provided as well. The systematic usefulness of pollen morphology in Rubiaceae is discussed at the (sub)family, tribal, generic, and infraspecific levels, using up-to-date evolutionary hypotheses for the different lineages in the family. The problems and opportunities of coding pollen characters for cladistic analyses are also treated.

11.
The study of species co-occurrences has been central in community ecology since the foundation of the discipline. Co-occurrence data are, nevertheless, a neglected source of information for modeling species distributions, and biogeographers are still debating the impact of biotic interactions on species distributions across geographical scales. We argue that a theory of species co-occurrence in ecological networks is needed to better inform interpretation of co-occurrence data, to formulate hypotheses for different community assembly mechanisms, and to extend the analysis of species distributions currently focused on the relationship between occurrences and abiotic factors. The main objective of this paper is to provide the first building blocks of a general theory for species co-occurrences. We formalize the problem with definitions of the different probabilities that are studied in the context of co-occurrence analyses. We analyze three species interaction modules and conduct multi-species simulations in order to document five principles influencing the associations between species within an ecological network: (i) direct interactions impact pairwise co-occurrence, (ii) indirect interactions impact pairwise co-occurrence, (iii) pairwise co-occurrences are rarely symmetric, (iv) the strength of an association decreases with the length of the shortest path between two species, and (v) the strength of an association decreases with the number of interactions a species is experiencing. Our analyses reveal the difficulty of interpreting species interactions from co-occurrence data. We discuss whether inferring the structure of interaction networks from co-occurrence data is feasible. We also argue that species distribution models could benefit from incorporating conditional probabilities of interactions within the models as an attempt to take into account the contribution of biotic interactions to shaping individual distributions of species.

12.
13.
T-REX (tree and reticulogram reconstruction) is an application to reconstruct phylogenetic trees and reticulation networks from distance matrices. The application includes a number of tree fitting methods like NJ, UNJ or ADDTREE which have been very popular in phylogenetic analysis. At the same time, the software comprises several new methods of phylogenetic analysis such as tree reconstruction using weights, tree inference from incomplete distance matrices or modeling a reticulation network for a collection of objects or species. T-REX also allows the user to visualize the resulting tree or network structures using Hierarchical, Radial or Axial types of tree drawing and manipulate them interactively. AVAILABILITY: T-REX is a freeware package available online at: http://www.fas.umontreal.ca/biol/casgrain/en/labo/t-rex

14.
15.
It is shown that the mathematical methods hitherto used for solving pattern recognition problems are usually not well adapted to the natural class-forming processes of patterns generated from objects of the real world. In particular, the well-known method using a multidimensional signal space is not suited for typical pattern recognition problems, although its value for signal detection problems is undisputed. The paper proposes to describe the class-forming process of patterns by means of spatial transformations in the three-dimensional space of the real world. Accordingly, a space distortion theory for both two-dimensional and three-dimensional objects has been developed. It leads to recognition schemes using a generalized correlation technique. The generalization consists of an adaptive spatial transformation process prior to correlation. In this way, pattern recognition is treated as an adaptive control process using optimization methods. The decomposition theory proposes applying this method to separate parts of the decomposed picture.
Summary The known mathematical methods of pattern recognition represent the signal in a high-dimensional space, the so-called signal space, whose coordinates are the sample values (or pixels) of the discretely represented signal. The necessary preprocessing (feature extraction) and the subsequent classification are carried out as operations in this signal space. This paper shows that, for non-trivial pattern recognition tasks such as handwritten characters or geometric objects in arbitrary position in space, the signal space is an unsuitable representation of the problem. The reason is that the patterns belonging to the different classes, viewed as points in the signal space, are in general intermixed and therefore not separable. The frequently applied linear transformations (e.g., Fourier transform, Loève-Karhunen transform, Hadamard transform) do not change this, since they can only produce a translation, a rotation, and a linear distortion of the signal space without altering its inner structure. This paper therefore proposes to consider the space of the real world instead of the multidimensional signal space and to perform coordinate transformations in it. The simplest transformations of this kind are the geometric transformations resulting from changes in the position of the objects. They have 6 degrees of freedom, namely 3 translations and 3 rotations corresponding to the 3 spatial coordinates. Including linear magnification of the object while preserving its shape gives 7 degrees of freedom, which in turn can be reduced to 6 if the perspective distortions of the object image are neglected. The recognition system proposed according to this theory performs these geometric transformations, or rather inverse transformations, adaptively, with a cross-correlation against an ideal prototype representing the pattern class serving as the indicator of convergence. For spatial objects, only the projection of the object onto the two-dimensional image field can be used for this correlation, so certain ambiguities cannot be absolutely excluded unless the object is moved or viewed from different sides. The recognition system thus performs a geometric transformation with 6 or 7 degrees of freedom that is adaptively optimized. In other words, a spatial-geometric target-finding procedure (tracking) is carried out, with which the pattern recognition process is inseparably linked. It is expected that, with a relatively simple implementation, this method will yield substantially better results than the previous methods based on transformations in the signal space, both for planar objects such as handwritten characters and for spatial objects.
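
The idea of applying a spatial transformation before correlating with a class prototype can be illustrated with a deliberately small sketch. It is reduced to one dimension and a single degree of freedom (a circular shift), and it uses exhaustive search instead of the adaptive optimization the paper proposes; the function names and the example data are assumptions.

```python
import math


def correlation(a, b):
    """Normalized cross-correlation of two equal-length sequences at zero lag."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den else 0.0


def best_shift(pattern, prototype):
    """Try every circular right shift of the pattern; keep the best match."""
    scored = []
    for s in range(len(pattern)):
        shifted = pattern[-s:] + pattern[:-s] if s else list(pattern)
        scored.append((correlation(shifted, prototype), s))
    score, shift = max(scored)
    return shift, score


prototype = [0, 0, 1, 2, 1, 0, 0, 0]
pattern = [1, 2, 1, 0, 0, 0, 0, 0]      # the prototype, displaced by two samples
print(best_shift(pattern, prototype))    # (2, 1.0)
```

The correlation score plays the role of the convergence indicator described above: the transformation parameter that maximizes it both aligns the pattern and measures how well it matches the class prototype.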

16.
Vilhelmsen L 《ZooKeys》2011,(130):343-361
The head capsule of a taxon sample of three outgroup and 86 ingroup taxa is examined for characters of possible phylogenetic significance within Hymenoptera. 21 morphological characters are illustrated and scored, and their character evolution explored by mapping them onto a phylogeny recently produced from a large morphological data set. Many of the characters are informative and display unambiguous changes. Most of the character support demonstrated is at the superfamily or family level. In contrast, only a few characters corroborate deeper nodes in the phylogeny of Hymenoptera.

17.
Cladistic analyses of 17 wild and cultivated pea taxa were performed using morphological characters, and allozyme and RAPD (random amplified polymorphic DNA) markers. Both branch-and-bound and bootstrap searches produced cladograms that confirmed the close relationships among the wild species and cultivars of Pisum proposed by a variety of systematic studies. Intraspecific rankings were supported for northern P. humile, southern P. humile, P. elatius and P. sativum, which together comprise a single-species complex. P. fulvum, while clearly the most divergent of the pea taxa, could also be assigned to the same species complex without violating the hierarchical logic of the cladogram. Its inclusion or exclusion depends on whether the level of interfertility it displays with other pea taxa or its overall morphological and chromosomal distinction is emphasized. As suggested by previous studies, northern P. humile was the most likely sister taxon to cultivated P. sativum, although rigorous phylogenetic evaluation revealed a close genealogical affinity among P. elatius, northern P. humile and P. sativum. Despite their limited number, the 16 morphological characters and allozyme markers used precisely organized the pea taxa into established taxonomic groupings, perhaps in part reflecting the role morphology has played historically in pea classification. The RAPD data also generally supported these same groupings and provided additional information regarding the relationships among the taxa. Given that RAPDs are relatively quick and easy to use, are refractory to many environmental influences, can be generated in large numbers, and can complement traditional characters that may be limited in availability, they provide a valuable new resource for phylogenetic studies.

18.
Despite the recent surge of interest in studying the evolution of development, surprisingly little work has been done to investigate the phylogenetic signal in developmental characters. Yet, both the potential usefulness of developmental characters in phylogenetic reconstruction and the validity of inferences on the evolution of developmental characters depend on the presence of such a phylogenetic signal and on the ability of our coding scheme to capture it. In a recent study, we showed, using simulations, that a new method (called the continuous analysis) using standardized time or ontogenetic sequence data and squared-change parsimony outperformed event pairing and event cracking in analyzing developmental data on a reference phylogeny. Using the same simulated data, we demonstrate that all these coding methods (event pairing and standardized time or ontogenetic sequence data) can be used to produce phylogenetically informative data. Despite some dependence between characters (the position of an event in an ontogenetic sequence is not independent of the position of other events in the same sequence), parsimony analysis of such characters converges on the correct phylogeny as the amount of data increases. In this context, the new coding method (developed for the continuous analysis) outperforms event pairing; it recovers a lower proportion of incorrect clades. This study thus validates the use of ontogenetic data in phylogenetic inference and presents a simple coding scheme that can extract a reliable phylogenetic signal from these data.

19.
Phylogenetic networks generalize evolutionary trees, and are commonly used to represent evolutionary histories of species that undergo reticulate evolutionary processes such as hybridization, recombination and lateral gene transfer. Recently, there has been great interest in trying to develop methods to construct rooted phylogenetic networks from triplets, that is rooted trees on three species. However, although triplets determine or encode rooted phylogenetic trees, they do not in general encode rooted phylogenetic networks, which is a potential issue for any such method. Motivated by this fact, Huber and Moulton recently introduced trinets as a natural extension of rooted triplets to networks. In particular, they showed that level-1 phylogenetic networks are encoded by their trinets, and also conjectured that all “recoverable” rooted phylogenetic networks are encoded by their trinets. Here we prove that recoverable binary level-2 networks and binary tree-child networks are also encoded by their trinets. To do this we prove two decomposition theorems based on trinets which hold for all recoverable binary rooted phylogenetic networks. Our results provide some additional evidence in support of the conjecture that trinets encode all recoverable rooted phylogenetic networks, and could also lead to new approaches to construct phylogenetic networks from trinets.

20.
A central task in the study of molecular evolution is the reconstruction of a phylogenetic tree from sequences of current-day taxa. The most established approach to tree reconstruction is maximum likelihood (ML) analysis. Unfortunately, searching for the maximum likelihood phylogenetic tree is computationally prohibitive for large data sets. In this paper, we describe a new algorithm that uses Structural Expectation Maximization (EM) for learning maximum likelihood phylogenetic trees. This algorithm is similar to the standard EM method for edge-length estimation, except that during iterations of the Structural EM algorithm the topology is improved as well as the edge length. Our algorithm performs iterations of two steps. In the E-step, we use the current tree topology and edge lengths to compute expected sufficient statistics, which summarize the data. In the M-step, we search for a topology that maximizes the likelihood with respect to these expected sufficient statistics. We show that searching for better topologies inside the M-step can be done efficiently, as opposed to standard methods for topology search. We prove that each iteration of this procedure increases the likelihood of the topology, and thus the procedure must converge. This convergence point, however, can be a suboptimal one. To escape from such "local optima," we further enhance our basic EM procedure by incorporating moves in the flavor of simulated annealing. We evaluate these new algorithms on both synthetic and real sequence data and show that for protein sequences even our basic algorithm finds more plausible trees than existing methods for searching maximum likelihood phylogenies. Furthermore, our algorithms are dramatically faster than such methods, enabling, for the first time, phylogenetic analysis of large protein data sets in the maximum likelihood framework.
