1.

On divide-and-conquer strategies for parsimony analysis of large data sets: Rec-I-DCM3 versus TNT (cited by: 1; self-citations: 0; citations by others: 1)

Goloboff PA Pol D 《Systematic biology》2007,56(3):485-495

Roshan et al. recently described a "divide-and-conquer" technique for parsimony analysis of large data sets, Rec-I-DCM3, and stated that it compares very favorably to results using the program TNT. Their technique is based on selecting subsets of taxa to create reduced data sets or subproblems, finding most-parsimonious trees for each reduced data set, recombining all parts together, and then performing global TBR swapping on the combined tree. Here, we contrast this approach to sectorial searches, a divide-and-conquer algorithm implemented in TNT. This algorithm also uses a guide tree to create subproblems, with the first-pass state sets of the nodes that join the selected sectors with the rest of the topology; this allows exact length calculations for the entire topology (that is, any solution N steps shorter than the original, for the reduced subproblem, must also be N steps shorter for the entire topology). We show here that, for sectors of similar size analyzed with the same search algorithms, subdividing data sets with sectorial searches produces better results than subdividing with Rec-I-DCM3. Roshan et al.'s claim that Rec-I-DCM3 outperforms the techniques in TNT was caused by a poor experimental design and algorithmic settings used for the runs in TNT. In particular, for finding trees at or very close to the minimum known length of the analyzed data sets, TNT clearly outperforms Rec-I-DCM3. Finally, we show that the performance of Rec-I-DCM3 is bound by the efficiency of TBR implementation for the complete data set, as this method behaves (after some number of iterations) as a technique for cyclic perturbations and improvements more than as a divide-and-conquer strategy.
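
The "first-pass state sets" in the abstract above presumably refer to the sets produced by the first (down) pass of Fitch's algorithm for unordered characters, which also yields the tree length. The sketch below illustrates only that standard first pass, not TNT's sectorial search or its implementation. The nested-tuple tree encoding, the one-letter states, and the function names `fitch_first_pass` and `tree_length` are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Tuple, Union

# A rooted binary tree: a leaf is a taxon name, an internal node is a pair.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_first_pass(tree: Tree, states: Dict[str, str]) -> Tuple[FrozenSet[str], int]:
    """Return (first-pass state set, number of steps) for one character."""
    if isinstance(tree, str):
        # Leaf: its state set is the single observed state, with no steps.
        return frozenset(states[tree]), 0
    left, right = tree
    left_set, left_steps = fitch_first_pass(left, states)
    right_set, right_steps = fitch_first_pass(right, states)
    shared = left_set & right_set
    if shared:
        # Descendants agree on at least one state: no extra step.
        return shared, left_steps + right_steps
    # Descendants disagree: take the union and count one step.
    return left_set | right_set, left_steps + right_steps + 1


def tree_length(tree: Tree, matrix: Dict[str, str]) -> int:
    """Sum the Fitch steps over all characters (columns) of the matrix."""
    n_chars = len(next(iter(matrix.values())))
    total = 0
    for i in range(n_chars):
        column = {taxon: seq[i] for taxon, seq in matrix.items()}
        total += fitch_first_pass(tree, column)[1]
    return total


# Hypothetical four-taxon matrix with three characters.
matrix = {"a": "AAG", "b": "AAG", "c": "CAT", "d": "CGT"}
print(tree_length((("a", "b"), ("c", "d")), matrix))  # 3
print(tree_length((("a", "c"), ("b", "d")), matrix))  # 5
```

Because the state set at a node summarizes everything below it, a subproblem can in principle be rescored without revisiting the taxa it replaces, which is the property the abstract relies on for exact length calculations.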

2.

Effects of data incompleteness on the relative performance of parsimony and Bayesian approaches in a supermatrix phylogenetic reconstruction of Mustelidae and Procyonidae (Carnivora)

Mieczyslaw Wolsan Jun J. Sato 《Cladistics : the international journal of the Willi Hennig Society》2010,26(2):168-194

Missing data are commonly thought to impede a resolved or accurate reconstruction of phylogenetic relationships, and probabilistic analysis techniques are increasingly viewed as less vulnerable to the negative effects of data incompleteness than parsimony analyses. We test both assumptions empirically by conducting parsimony and Bayesian analyses on an approximately 1.5 × 10⁶‐cell (27 965 characters × 52 species) mustelid–procyonid molecular supermatrix with 62.7% missing entries. Contrary to the first assumption, phylogenetic relationships inferred from our analyses are fully (Bayesian) or almost fully (parsimony) resolved topologically with mostly strong support and also largely in accord with prior molecular estimations of mustelid and procyonid phylogeny derived with parsimony, Bayesian, and other probabilistic analysis techniques from smaller but complete or nearly complete data sets. Contrary to the second assumption, we found no compelling evidence in support of a relationship between the inferior performance of parsimony and taxon incompleteness (i.e. the proportion of missing character data for a taxon), although we found evidence for a connection between the inferior performance of parsimony and character incompleteness (i.e. no overlap in character data between some taxa). The relatively good performance of our analyses may be related to the large number of sampled characters, so that most taxa (even highly incomplete ones) are represented by a sufficient number of characters allowing both approaches to resolve their relationships. © The Willi Hennig Society 2009.

3.

TNT version 1.5, including a full implementation of phylogenetic morphometrics

Pablo A. Goloboff Santiago A. Catalano 《Cladistics : the international journal of the Willi Hennig Society》2016,32(3):221-238

Version 1.5 of the computer program TNT completely integrates landmark data into phylogenetic analysis. Landmark data consist of coordinates (in two or three dimensions) for the terminal taxa; TNT reconstructs shapes for the internal nodes such that the difference between ancestor and descendant shapes for all tree branches sums up to a minimum; this sum is used as tree score. Landmark data can be analysed alone or in combination with standard characters; all the applicable commands and options in TNT can be used transparently after reading a landmark data set. The program continues implementing all the types of analyses in former versions, including discrete and continuous characters (which can now be read at any scale, and automatically rescaled by TNT). Using algorithms described in this paper, searches for landmark data can be made tens to hundreds of times faster than was possible before (from T to 3T times faster, where T is the number of taxa), thus making phylogenetic analysis of landmarks feasible even on standard personal computers.

4.

Inferring complex phylogenies using parsimony: an empirical approach using three large DNA data sets for angiosperms

Soltis DE Soltis PS Mort ME Chase MW Savolainen V Hoot SB Morton CM 《Systematic biology》1998,47(1):32-42

To explore the feasibility of parsimony analysis for large data sets, we conducted heuristic parsimony searches and bootstrap analyses on separate and combined DNA data sets for 190 angiosperms and three outgroups. Separate data sets of 18S rDNA (1,855 bp), rbcL (1,428 bp), and atpB (1,450 bp) sequences were combined into a single matrix 4,733 bp in length. Analyses of the combined data set show great improvements in computer run times compared to those of the separate data sets and of the data sets combined in pairs. Six searches of the 18S rDNA + rbcL + atpB data set were conducted; in all cases TBR branch swapping was completed, generally within a few days. In contrast, TBR branch swapping was not completed for any of the three separate data sets, or for the pairwise combined data sets. These results illustrate that it is possible to conduct a thorough search of tree space with large data sets, given sufficient signal. In this case, and probably most others, sufficient signal for a large number of taxa can only be obtained by combining data sets. The combined data sets also have higher internal support for clades than the separate data sets, and more clades receive bootstrap support of ≥ 50% in the combined analysis than in analyses of the separate data sets. These data suggest that one solution to the computational and analytical dilemmas posed by large data sets is the addition of nucleotides, as well as taxa.

5.

A cladistic reconstruction of the ancestral mite harvestman (Arachnida,Opiliones, Cyphophthalmi): portrait of a Paleozoic detritivore

Gonzalo Giribet 《Cladistics : the international journal of the Willi Hennig Society》2012,28(6):582-597

6.

A Report on "One Day Symposium on Numerical Cladistics"

Inés Horovitz 《Cladistics : the international journal of the Willi Hennig Society》1999,15(2):177-182

A recent symposium on numerical cladistics held at the American Museum of Natural History in New York City addressed novel methods for searching tree space, applications of randomizations in cladistic analysis, and data management. One of the major concerns in systematics is that of finding the global optimum in tree length. The space to search is complex because it includes many local optima. It is a difficult task to escape local optima without a great loss in efficiency. The ideal is to search among suboptimal topologies and still obtain an answer in a reasonable amount of time. Nixon presented a new family of methods called "parsimony ratchet," which are successful at escaping local optima. Moilanen presented a new program which may have similar advantages. Two presentations, one by Goloboff and Farris and another by Farris, Goloboff, Källersjö, and Oxelman, introduced modifications to parsimony jackknifing that improved its accuracy when compared to normal heuristic searches. Wheeler discussed the advantages of new methods of analyzing DNA and protein sequence data, which eliminate multiple alignment; the most recent one packs nucleotides into strings which constitute the new characters. Siddall discussed different applications of randomization in cladistics and their logical consistency, finding some more acceptable than others. Nixon and Carpenter presented a new program for managing data. This symposium will probably be a landmark judging from the originality and practicality of the points presented.

7.

First fossil Molinaranea Mello‐Leitão, 1940 (Araneae: Araneidae), from middle Miocene Dominican amber,with a phylogenetic and palaeobiogeographical analysis of the genus

ERIN E. SAUPE PAUL A. SELDEN DAVID PENNEY 《Zoological Journal of the Linnean Society》2010,158(4):711-725

The first fossil Molinaranea is described, from middle Miocene Dominican amber. This record extends the known range of the genus back 16 million years; it also extends the geographical range of the genus through time, with extant species known only from Chile, Argentina, the Falkland Islands, and Juan Fernandez Island. A parsimony‐based phylogenetic analysis was performed, which indicates that the fossil species, Molinaranea mitnickii sp. nov., is nested with Molinaranea magellanica Walckenaer, 1847 and Molinaranea clymene Nicolet, 1849. A modified Brooks parsimony analysis was conducted in order to examine the biogeography and origins of the fossil species in the Dominican Republic; the analysis suggests that M. mitnickii sp. nov. arrived in Hispaniola from South America as a result of a chance dispersal event. © 2010 The Linnean Society of London, Zoological Journal of the Linnean Society, 2010, 158, 711–725.

8.

Parsimony analysis of endemicity as a panbiogeographical tool: an analysis of Caribbean plant taxa

AMPARO ECHEVERRY JUAN J. MORRONE 《Biological journal of the Linnean Society. Linnean Society of London》2010,101(4):961-976

To demonstrate that parsimony analysis of endemicity (PAE) can be a method implementing the panbiogeographic approach, we analyzed two data matrices of 40/38 biogeographic provinces × 148 plant species from the Caribbean subregion of the Neotropical region, one where taxa are represented by individual tracks and the other where taxa are represented by single sample localities. We obtained six generalized tracks resulting from the PAE of the areas × individual tracks matrix, and one generalized track from the PAE of the areas × single sample localities matrix, with the latter nested within the former tracks. The results obtained show that PAE works as a panbiogeographical tool if it is based on an areas × individual tracks matrix. When performed in this way, PAE retrieves spatial information that is lost when it is based on an areas × single sample localities matrix, raising doubts regarding the conclusions derived from this latter type of analysis. © 2010 The Linnean Society of London, Biological Journal of the Linnean Society, 2010, 101, 961–976.

9.

A rate-independent technique for analysis of nucleic acid sequences: evolutionary parsimony (cited by: 24; self-citations: 1; citations by others: 23)

Lake JA 《Molecular biology and evolution》1987,4(2):167-191

The method of evolutionary parsimony (or operator invariants) is a technique of nucleic acid sequence analysis related to parsimony analysis and explicitly designed for determining evolutionary relationships among four distantly related taxa. The method is independent of substitution rates because it is derived from consideration of the group properties of substitution operators rather than from an analysis of the probabilities of substitution in branches of a tree. In both parsimony and evolutionary parsimony, three patterns of nucleotide substitution are associated one-to-one with the three topologically linked trees for four taxa. In evolutionary parsimony, the three quantities are operator invariants. These invariants are the remnants of substitutions that have occurred in the interior branch of the tree and are analogous to the substitutions assigned to the central branch by parsimony. The two invariants associated with the incorrect trees must equal zero (statistically), whereas only the correct tree can have a nonzero invariant. The χ²-test is used to ascertain the nonzero invariant and the statistically favored tree. Examples, obtained using data calculated with evolutionary rates and branchings designed to camouflage the true tree, show that the method accurately predicts the tree, even when substitution rates differ greatly in neighboring peripheral branches (conditions under which parsimony will consistently fail). As the number of substitutions in peripheral branches becomes fewer, the parsimony and the evolutionary-parsimony solutions converge. The method is robust and easy to use.

10.

Fast Mapping of Short Sequences with Mismatches,Insertions and Deletions Using Index Structures

Steve Hoffmann Christian Otto Stefan Kurtz Cynthia M. Sharma Philipp Khaitovich Jörg Vogel Peter F. Stadler Jörg Hackermüller 《PLoS computational biology》2009,5(9)

11.

Computer programs in nucleic acid synthesis: synthetic strategy development using solid-phase chemical techniques with data storage, retrieval and analysis capabilities.

S Lombardi H Seidell S Pulford W Dutton S Parekh 《Nucleic acids research》1984,12(5):2581-2591

A computer program has been designed to aid development of synthetic strategies for oligonucleotides produced by solid-phase chemical techniques. The program reduces the time required to develop a strategy and a data file from hours to minutes. The program contains inventories, provides cost analyses, and generates and stores other associated data. The program searches an inventory of sequences for that sequence to avoid duplicate synthesis. If the sequence is not in the inventory the program devises a synthetic strategy, calculates the amounts of reagents and labor costs necessary to complete the synthetic oligonucleotide. The program also deducts the reagents from inventory files. Physical data is also calculated. A file is generated in a sequence inventory for storage of the data as well as other data that will be generated during the purification processes. All variable parameters can be easily edited. The programs were designed to provide a cross-referencing feature for data analysis and can use several parameters as a constant.

12.

Plethodontid salamander mitochondrial genomics: A parsimony evaluation of character conflict and implications for historical biogeography (cited by: 1; self-citations: 0; citations by others: 1)

J. Robert Macey 《Cladistics : the international journal of the Willi Hennig Society》2005,21(2):194-202

A new parsimony analysis of 27 complete mitochondrial genomic sequences is conducted to investigate the phylogenetic relationships of plethodontid salamanders. This analysis focuses on the amount of character conflict between phylogenetic trees recovered from newly conducted parsimony searches and the Bayesian and maximum likelihood topology reported by Mueller et al. (2004; PNAS, 101, 13820–13825). Strong support for Hemidactylium as the sister taxon to all other plethodontids is recovered from parsimony analyses. Plotting area relationships on the most parsimonious phylogenetic tree suggests that eastern North America is the origin of the family Plethodontidae, supporting the “Out of Appalachia” hypothesis. A new taxonomy that recognizes clades recovered from phylogenetic analyses is proposed. © The Willi Hennig Society 2005.

13.

CLASS2: accurate and efficient splice variant annotation from RNA-seq reads

Li Song Sarven Sabunciyan Liliana Florea 《Nucleic acids research》2016,44(10):e98

14.

Linked‐read sequencing enables haplotype‐resolved resequencing at population scale

Dave Lutgen Raphael Ritter Remi‐André Olsen Holger Schielzeth Joel Gruselius Philip Ewels Jesús T. García Hadoram Shirihai Manuel Schweizer Alexander Suh Reto Burri 《Molecular ecology resources》2020,20(5):1311-1322

The feasibility of sequencing entire genomes of virtually any organism provides unprecedented insights into the evolutionary history of populations and species. Nevertheless, many population genomic inferences – including the quantification and dating of admixture, introgression and demographic events, and inference of selective sweeps – are still limited by the lack of high‐quality haplotype information. The newest generation of sequencing technology now promises significant progress. To establish the feasibility of haplotype‐resolved genome resequencing at population scale, we investigated properties of linked‐read sequencing data of songbirds of the genus Oenanthe across a range of sequencing depths. Our results based on the comparison of downsampled (25×, 20×, 15×, 10×, 7×, and 5×) with high‐coverage data (46–68×) of seven bird genomes mapped to a reference suggest that phasing contiguities and accuracies adequate for most population genomic analyses can be reached already with moderate sequencing effort. At 15× coverage, phased haplotypes span about 90% of the genome assembly, with 50% and 90% of phased sequences located in phase blocks longer than 1.25–4.6 Mb (N50) and 0.27–0.72 Mb (N90). Phasing accuracy reaches beyond 99% starting from 15× coverage. Higher coverages yielded higher contiguities (up to about 7 Mb/1 Mb [N50/N90] at 25× coverage), but only marginally improved phasing accuracy. Phase block contiguity improved with input DNA molecule length; thus, higher‐quality DNA may help keep sequencing costs at bay. In conclusion, even for organisms with gigabase‐sized genomes like birds, linked‐read sequencing at moderate depth opens an affordable avenue towards haplotype‐resolved genome resequencing at population scale.
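
The N50 and N90 statistics quoted above can be computed from a list of phase block lengths. The sketch below is a minimal illustration under the usual definition (the block length at which the cumulative sum of blocks, sorted from longest to shortest, first reaches the given fraction of the total). The function name `nx` and the example lengths are hypothetical and are not taken from the study.

```python
from typing import Sequence


def nx(lengths: Sequence[int], fraction: float) -> int:
    """Return the Nx statistic, e.g. fraction=0.5 for N50, 0.9 for N90.

    Blocks are visited from longest to shortest; the result is the length
    of the block at which the running total first reaches `fraction` of
    the summed length of all blocks.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= fraction * total:
            return length
    return min(lengths)  # only reachable through floating-point edge cases


# Hypothetical phase block lengths (arbitrary units).
blocks = [8, 5, 4, 2, 1]
print(nx(blocks, 0.5))  # 5: blocks of length >= 5 hold 13 of 20 units
print(nx(blocks, 0.9))  # 2: blocks of length >= 2 hold 19 of 20 units
```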

15.

A new method in computer-assisted imaging in neuroanatomy (cited by: 2; self-citations: 0; citations by others: 2)

D G von Keyserlingk K Niemann J Wasel J Reinold K Poeck 《Acta anatomica》1985,123(4):240-246

A procedure is described yielding computed images of postmortem brains with high topographic accuracy. Structures of the brain are traced and registered by means of a digitizer capable of measuring coordinates three-dimensionally. The information corresponding to one brain model is stored on a flexible disk with a capacity of 256 Kbytes. According to the output desired, the resulting brain images are either completely or partially displayed on the computer screen as stereo pairs. The brain models possess a local fidelity of about 1 mm. The images are useful in simultaneously studying superficial and central parts of the brain, spatial relationships of the various structures and the projection of deep structures onto the surface of the brain. A RAM of about 100 Kbytes is necessary for a program enabling the user to perform stereo projections, three-dimensional transformations and other image manipulations. The special features of anatomical computer imaging as compared to computed tomography (CT) and nuclear magnetic resonance imaging (NMR) are outlined. A combination of these different techniques seems to improve clinical diagnosis.

16.

A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood (cited by: 16; self-citations: 0; citations by others: 16)

Guindon S Gascuel O 《Systematic biology》2003,52(5):696-704

The increase in the number of large data sets and the complexity of current probabilistic sequence evolution models necessitates fast and reliable phylogeny reconstruction methods. We describe a new approach, based on the maximum-likelihood principle, which clearly satisfies these requirements. The core of this method is a simple hill-climbing algorithm that adjusts tree topology and branch lengths simultaneously. This algorithm starts from an initial tree built by a fast distance-based method and modifies this tree to improve its likelihood at each iteration. Due to this simultaneous adjustment of the topology and branch lengths, only a few iterations are sufficient to reach an optimum. We used extensive and realistic computer simulations to show that the topological accuracy of this new method is at least as high as that of the existing maximum-likelihood programs and much higher than the performance of distance-based and parsimony approaches. The reduction of computing time is dramatic in comparison with other maximum-likelihood packages, while the likelihood maximization ability tends to be higher. For example, only 12 min were required on a standard personal computer to analyze a data set consisting of 500 rbcL sequences with 1,428 base pairs from plant plastids, thus reaching a speed of the same order as some popular distance-based and parsimony algorithms. This new method is implemented in the PHYML program, which is freely available on our web page: http://www.lirmm.fr/w3ifa/MAAS/.

17.

Phylogenetic algorithms and the evolution of species communities in forest fragments

Roseli Pellens Philippe Grandcolas Eric Guilbert 《Cladistics : the international journal of the Willi Hennig Society》2005,21(1):8-14

In forest fragmentation studies, low specific richness in small fragments and community nestedness are usually considered to result from species loss. However, except in the case of fragmentation experiments, these studies cannot distinguish between original low richness and secondary species loss, or between original high richness and secondary colonizations in fragments. To distinguish between these possibilities is a matter of historical inference for which phylogenetic algorithms are designed. The methods of phylogenetic analysis, and especially parsimony analysis, can be used to find a tree of relationships between communities from different forest fragments, taking the presence or absence of species among different communities as characters. Parsimony analysis tests whether species subsets can be classified in a nested hierarchy, and also establishes how the communities evolved, polarizing species changes into either extinctions or colonizations. By re‐analyzing two classical studies in this new and powerful way, we demonstrate that the differences between fragments and large continuous forests cannot be attributed to species loss in all cases, contrary to expectations from models. © The Willi Hennig Society 2005.

18.

Identification of the areas of endemism (AOEs) of the genus Acantholimon (Plumbaginaceae) in Iran

Farzaneh Khajoei Nasab Ahmad Reza Khosravi 《Plant biosystems》2020,154(5):726-736

This study was conducted to identify areas of endemism for Acantholimon species using parsimony analysis of endemicity (PAE) and to detect endemic species richness of the genus in the region. The results obtained from the two methods were used to determine priorities for the conservation of Acantholimon species in Iran. The distribution database of 62 endemic species belonging to this genus was built from 1250 georeferenced observations in Iran. The study area was divided into 1° × 1° grids of operative geographical units (OGUs), and a species × area matrix of presence/absence data was created. The endemic species richness was calculated using a circular neighborhood with a radius of 50 km in 10 × 10 km² raster cells using DIVA-GIS software. The PAE results show four areas of endemism (AOEs) in Iran: AOE1 includes the Alborz and Zagros mountains and the mountains of central Iran, AOE2 and AOE3 are located in the Khorassan subregion, and AOE4 contains parts of western Iran. The map of endemic species richness indicated that the highest number of endemic species occurs in the central Alborz region as well as Kerman, Chahar-Mahal and Bakhtiari, and Isfahan provinces.

19.

ACOHAP: an efficient ant colony optimization for the haplotype inference by pure parsimony problem

Dong Duc Do, Sy Vinh Le, Xuan Huan Hoang 《Swarm Intelligence》2013,7(1):63-77

Haplotype information plays an important role in many genetic analyses. However, the identification of haplotypes based on sequencing methods is both expensive and time consuming. Current sequencing methods are only efficient to determine conflated data of haplotypes, that is, genotypes. This raises the need to develop computational methods to infer haplotypes from genotypes. Haplotype inference by pure parsimony is an NP-hard problem and still remains a challenging task in bioinformatics. In this paper, we propose an efficient ant colony optimization (ACO) heuristic method, named ACOHAP, to solve the problem. The main idea is based on the construction of a binary tree structure through which ants can travel and resolve conflated data of all haplotypes from site to site. Experiments with both small and large data sets show that ACOHAP outperforms other state-of-the-art heuristic methods. ACOHAP is as good as the currently best exact method, RPoly, on small data sets. However, it is much better than RPoly on large data sets. These results demonstrate the efficiency of the ACOHAP algorithm to solve the haplotype inference by pure parsimony problem for both small and large data sets.
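
To make the pure-parsimony objective concrete, the sketch below solves tiny instances by exhaustive enumeration: every genotype must be explained by a pair of haplotypes, and the goal is to use as few distinct haplotypes as possible. This is neither ACOHAP nor RPoly, and it scales exponentially, so it is only useful as an illustration. The encoding (`0` and `1` for homozygous sites, `2` for heterozygous sites) is a common convention assumed here, and the function names are hypothetical.

```python
from itertools import product
from typing import Iterator, List, Set, Tuple


def resolutions(genotype: str) -> Iterator[Tuple[str, str]]:
    """Yield every unordered haplotype pair that explains one genotype."""
    het_sites = [i for i, g in enumerate(genotype) if g == "2"]
    if not het_sites:
        # Fully homozygous: both haplotypes equal the genotype itself.
        yield genotype, genotype
        return
    # Fix the first heterozygous site to 0 on the first haplotype so that
    # mirrored pairs (h1, h2) and (h2, h1) are not generated twice.
    for rest in product("01", repeat=len(het_sites) - 1):
        h1, h2 = list(genotype), list(genotype)
        for site, bit in zip(het_sites, ("0",) + rest):
            h1[site] = bit
            h2[site] = "1" if bit == "0" else "0"
        yield "".join(h1), "".join(h2)


def pure_parsimony(genotypes: List[str]) -> Set[str]:
    """Return a smallest set of haplotypes that resolves all genotypes."""
    best: Set[str] = set()
    found = False
    options = [list(resolutions(g)) for g in genotypes]
    for combo in product(*options):
        used = {h for pair in combo for h in pair}
        if not found or len(used) < len(best):
            best, found = used, True
    return best


# Hypothetical two-site example with three genotypes.
print(sorted(pure_parsimony(["12", "21", "22"])))  # ['01', '10', '11']
```

In the example, the genotype `22` could be resolved as `00`/`11` or as `01`/`10`; the second choice reuses a haplotype already required by `21`, so three distinct haplotypes suffice instead of four.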

20.

WinGene/WinPep: user-friendly software for the analysis of amino acid sequences.

L Hennig 《BioTechniques》1999,26(6):1170-1172

WinGene1.0/WinPep1.2 is a pair of Microsoft Windows programs designed to read nucleotide or amino acid sequence data. These versatile programs have the following capabilities: (i) searches for open reading frames and their translation, (ii) assistance with the design of primers for PCR and (iii) calculation of molecular weight, isoelectric point and molar absorption coefficients of polypeptides. Furthermore, hydropathic plots and helical wheel displays are easily produced. The programs run with an intuitive Windows interface, contain a comprehensive help file and enable data exchange with other applications by means of the Copy&Paste command. The software is free for academic and noncommercial users.
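
As an illustration of the first capability listed above, the sketch below scans the three forward reading frames of a DNA sequence for open reading frames that begin with ATG and end at the first in-frame stop codon. It is not WinGene's implementation: the function name `find_orfs`, the `min_codons` parameter, and the restriction to the forward strand and the standard stop codons are assumptions made for this example.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def find_orfs(seq: str, min_codons: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) positions of ORFs on the forward strand.

    Positions are 0-based and half-open, so seq[start:end] runs from the
    ATG through the stop codon. `min_codons` is the minimum number of
    codons before the stop, start codon included.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens the reading frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


# Hypothetical sequence containing a single ORF, ATG AAA TAG.
sequence = "CCATGAAATAGGG"
for start, end in find_orfs(sequence):
    print(start, end, sequence[start:end])  # 2 11 ATGAAATAG
```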