Similar articles
 20 similar articles found (search time: 15 ms)
1.
The future of phylogeny reconstruction   Total citations: 1, self-citations: 0, citations by others: 1
A new approach to phylogenetic analysis, parsimony jackknifing, uses simple parsimony calculations combined with resampling of characters to arrive at a tree comprising well-supported groups. This is usually much the same as the consensus of most-parsimonious trees found from extensive multiple-tree calculations, but the new method is thousands of times faster, allowing analysis of much larger data matrices, and also provides information on the strength of support for different groups. Jackknife frequencies provide a more reliable assessment of support than do alternative methods, notably "confidence probability" (CP) and T-PTP testing.
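The resampling step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' program: the deletion probability of e^-1 per character, the 50% cutoff, and all function names are assumptions, and `toy_groups` is a hypothetical stand-in for a real parsimony tree search (it simply reports every group of taxa sharing state 1 in a retained character).

```python
import math
import random
from collections import Counter

def jackknife_sample(n_chars, rng, p_delete=math.exp(-1)):
    """Return indices of characters kept in one replicate.
    Each character is deleted independently with probability p_delete (assumed e^-1)."""
    return [i for i in range(n_chars) if rng.random() >= p_delete]

def toy_groups(matrix, kept):
    """Hypothetical stand-in for a parsimony search: report each group of
    taxa that share state 1 in at least one retained binary character."""
    taxa = sorted(matrix)
    groups = set()
    for i in kept:
        members = frozenset(t for t in taxa if matrix[t][i] == 1)
        if 2 <= len(members) < len(taxa):  # skip trivial groups
            groups.add(members)
    return groups

def jackknife_support(matrix, find_groups, replicates=200, cutoff=0.5, seed=0):
    """Tally how often each group is found across replicates and keep
    only groups whose frequency reaches the cutoff."""
    rng = random.Random(seed)
    n_chars = len(next(iter(matrix.values())))
    counts = Counter()
    for _ in range(replicates):
        kept = jackknife_sample(n_chars, rng)
        counts.update(find_groups(matrix, kept))
    return {g: c / replicates for g, c in counts.items() if c / replicates >= cutoff}
```

In this sketch a group supported by many characters survives nearly every replicate, while a group resting on a single character is lost whenever that character is deleted, which is the intuition behind reading jackknife frequencies as support.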

2.
The ever-larger data matrices resulting from continuing improvements in DNA sequencing techniques require faster and more efficient methods of phylogenetic analysis. Here we explore a promising new method, parsimony jackknifing, by analyzing a matrix comprising 2538 sequences of the chloroplast gene rbcL. The sequences included cover a broad taxonomic range, from cyanobacteria to flowering plants. Several parsimony jackknife analyses were performed, both with and without branch-swapping and multiple random addition sequences: 1) including all positions; 2) including only first and second codon positions; 3) including only third positions; and 4) using only transversions. The best resolution was obtained using all positions. Removal of third positions or transitions led to massive loss of resolution, although using only transversions somewhat improved basal resolution. While branch-swapping improved both resolution and the support found for several groups, most of the groups could be recovered by faster simple analyses. Designed to eliminate groups poorly supported by the data, parsimony jackknifing recognizes 1400 groups on the basis of all rbcL positions. These include major taxa such as green plants, land plants, flowering plants, monocots and eudicots. We include appendices of supported angiosperm families, as well as larger groups.

3.
Analyzing Large Data Sets in Reasonable Times: Solutions for Composite Optima   Total citations: 19, self-citations: 3, citations by others: 16
New methods for parsimony analysis of large data sets are presented. The new methods are sectorial searches, tree-drifting, and tree-fusing. For Chase et al.'s 500-taxon data set these methods (on a 266-MHz Pentium II) find a shortest tree in less than 10 min (i.e., over 15,000 times faster than PAUP and 1000 times faster than PAUP*). Making a complete parsimony analysis requires hitting minimum length several times independently, but not necessarily all "islands"; for Chase et al.'s data set, this can be done in 4 to 6 h. The new methods also perform well in other cases analyzed (which range from 170 to 854 taxa).

4.
The phylogenetic relationships of the African lungfish (Protopterus dolloi) and the coelacanth (Latimeria chalumnae) with respect to tetrapods were analyzed using complete mitochondrial genome DNA sequences. A lungfish + coelacanth clade was favored by maximum parsimony (although this result is dependent on which transition:transversion weights are applied), and a lungfish + tetrapod clade was supported by neighbor-joining and maximum-likelihood analyses. These two hypotheses received the strongest statistical and bootstrap support to the exclusion of the third alternative, the coelacanth + tetrapod sister group relationship. All mitochondrial protein coding genes combined favor a lungfish + tetrapod grouping. We can confidently reject the hypothesis that the coelacanth is the closest living relative of tetrapods. When the complete mitochondrial sequence data were combined with nuclear 28S rRNA gene data, a lungfish + coelacanth clade was supported by maximum parsimony and maximum likelihood, but a lungfish + tetrapod clade was favored by neighbor-joining. The seemingly conflicting results based on different data sets and phylogenetic methods were typically not statistically strongly supported based on Kishino-Hasegawa and Templeton tests, although they were often supported by strong bootstrap values. Differences in rate of evolution of the different mitochondrial genes (slowly evolving genes such as the cytochrome oxidase and tRNA genes favored a lungfish + coelacanth clade, whereas genes of relatively faster substitution rate, such as several NADH dehydrogenase genes, supported a lungfish + tetrapod grouping), as well as the rapid radiation of the lineages back in the Devonian, rather than base compositional biases among taxa seem to be directly responsible for the remaining uncertainty in accepting one of the two alternate hypotheses.

5.
In Colless’ (1995, Syst. Biol. 44, 102–108) results, cladograms for randomly generated matrices were strongly asymmetrical, and he used this to maintain that real cladograms provide little evidence on asymmetry of phylogeny. His position, however, depended on retaining poorly supported groups as if they were well-supported. If poorly supported groups are removed, as with parsimony jackknifing, well-structured real data can still give strong asymmetry, while random matrices simply yield unresolved trees, obviating Colless’ argument.

6.
Because it is based on a significance test that takes the shape of the tree as given, the Rzhetsky/Nei Confidence Probability (CP) can attribute high "confidence" to groups with little or even literally no support. CP further overestimates confidence in that it takes no account of reliability of alignment, and it shows instability in that drastic changes in results can be produced by small changes in data. Instability can arise when alignment is uncertain, since different alignment strategies can lead to slightly different matrices. Parsimony jackknifing offers a more reliable and stable way of assessing support. To take ambiguities of alignment into account with parsimony jackknifing, we suggest "consensus" and "average" methods of summarizing jackknife results from several alignments. Reanalyzing 12S and 16S rRNA data on pelecaniform birds, we find that CP has overestimated support for the Ciconiida, for placing frigatebirds with condors, and for placing tropicbirds with cormorants.

7.
Haplotypes are an important resource for a large number of applications in human genetics, but computationally inferred haplotypes are subject to switch errors that decrease their utility. The accuracy of computationally inferred haplotypes increases with sample size, and although ever larger genotypic data sets are being generated, the fact that existing methods require substantial computational resources limits their applicability to data sets containing tens or hundreds of thousands of samples. Here, we present HAPI-UR (haplotype inference for unrelated samples), an algorithm that is designed to handle unrelated and/or trio and duo family data, that has accuracy comparable to or greater than existing methods, and that is computationally efficient and can be applied to 100,000 samples or more. We use HAPI-UR to phase a data set with 58,207 samples and show that it achieves practical runtime and that switch errors decrease with sample size even with the use of samples from multiple ethnicities. Using a data set with 16,353 samples, we compare HAPI-UR to Beagle, MaCH, IMPUTE2, and SHAPEIT and show that HAPI-UR runs 18× faster than all methods and has a lower switch-error rate than do other methods except for Beagle; with the use of consensus phasing, running HAPI-UR three times gives a slightly lower switch-error rate than Beagle does and is more than six times faster. We demonstrate results similar to those from Beagle on another data set with a higher marker density. Lastly, we show that HAPI-UR has better runtime scaling properties than does Beagle so that for larger data sets, HAPI-UR will be practical and will have an even larger runtime advantage. HAPI-UR is available online (see Web Resources).

8.
The genus Lecanactis, with 24 species, has been phylogenetically analysed using cladistic parsimony methods and support tests. Morphological, anatomical and chemical data were used, comprising 38 characters. Twelve equally most parsimonious trees were obtained. The successive approximations character weighting method gave one most parsimonious tree. The ingroup, Lecanactis, is supported as monophyletic. Although parsimony jackknifing and Bremer support indicate that the trees are poorly supported, some groups are wholly or partly distinguished in the strict consensus tree, the successive weighting tree and the Jac tree.

9.
Oblong, a program with very low memory requirements, is presented. It is designed for parsimony analysis of data sets comprising many characters for moderate numbers of taxa (the order of up to a few hundred). The program can avoid using vast amounts of RAM by temporarily saving data to disk buffers, only parts of which are periodically read back in by the program. In this way, the entire data set is never held in RAM by the program, only small parts of it. While using disk files to store the data slows down searches, it does so only by a relatively small factor (4× to 5×), because the program minimizes the number of times the data must be accessed (i.e. read back in) during tree searches. Thus, even if the program is not designed primarily for speed, runtimes are within an order of magnitude of those of the fastest existing parsimony programs.

10.
Even when the maximum likelihood (ML) tree is a better estimate of the true phylogenetic tree than those produced by other methods, the result of a poor ML search may be no better than that of a more thorough search under some faster criterion. The ability to find the globally optimal ML tree is therefore important. Here, I compare a range of heuristic search strategies (and their associated computer programs) in terms of their success at locating the ML tree for 20 empirical data sets with 14 to 158 sequences and 411 to 120,762 aligned nucleotides. Three distinct topics are discussed: the success of the search strategies in relation to certain features of the data, the generation of starting trees for the search, and the exploration of multiple islands of trees. As a starting tree, there was little difference among the neighbor-joining tree based on absolute differences (including the BioNJ tree), the stepwise-addition parsimony tree (with or without nearest-neighbor-interchange (NNI) branch swapping), and the stepwise-addition ML tree. The latter produced the best ML score on average but was orders of magnitude slower than the alternatives. The BioNJ tree was second best on average. As search strategies, star decomposition and quartet puzzling were the slowest and produced the worst ML scores. The DPRml, IQPNNI, MultiPhyl, PhyML, PhyNav, and TreeFinder programs with default options produced qualitatively similar results, each locating a single tree that tended to be in an NNI suboptimum (rather than the global optimum) when the data set had low phylogenetic information. For such data sets, there were multiple tree islands with very similar ML scores. The likelihood surface only became relatively simple for data sets that contained approximately 500 aligned nucleotides for 50 sequences and 3,000 nucleotides for 100 sequences. The RAxML and GARLI programs allowed multiple islands to be explored easily, but both programs also tended to find NNI suboptima. A newly developed version of the likelihood ratchet using PAUP* successfully found the peaks of multiple islands, but its speed needs to be improved.

11.
In response to comments by J. S. Farris (2000, Cladistics 16, 403–410) on the strongest evidence (SE) approach to phylogenetic analysis, I examine the concepts on which it is founded and reevaluate its merits. SE's null model of signal absence in characters is not treated as background knowledge, but as a reference point for evaluating a data set's phylogenetic signal in a tree-specific manner. In simulation tests, the SE methods perform reasonably well; although parsimony is generally more accurate and less biased than SE, SE is distinctly more accurate in some circumstances. Simulations further indicate that jackknifing is often beneficial in both SE and parsimony analyses. Iterative fixation of splits shows promise as an auxiliary procedure for SE and other methods that weight according to apparent homoplasy.

12.
This study describes novel algorithms for searching for most parsimonious trees. These algorithms are implemented as a parsimony computer program, PARSIGAL, which performs well even with difficult data sets. For high level search, PARSIGAL uses an evolutionary optimization algorithm, which feeds good tree candidates to a branch-swapping local search procedure. This study also describes an extremely fast method of recomputing state sets for binary characters (additive or nonadditive characters with two states), based on packing 32 characters into a single memory word and recomputing the tree simultaneously for all 32 characters using fast bitwise logical operations. The operational principles of PARSIGAL are quite different from those previously published for other parsimony computer programs. Hence it is conceivable that PARSIGAL may be able to locate islands of trees that are different from those that are easily located with existing parsimony computer programs.
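The bit-packing idea mentioned in this abstract can be illustrated with a small sketch. It is not PARSIGAL's code: it assumes a standard Fitch down-pass on a rooted binary tree, represents each node's state sets as two bitmasks (one bit per character for "state 0 possible" and "state 1 possible"), and uses hypothetical names throughout. Python integers stand in for 32-bit machine words.

```python
def pack(states):
    """Pack up to 32 binary character states into two bitmasks:
    (bits where state 0 is present, bits where state 1 is present)."""
    assert len(states) <= 32, "one word holds at most 32 characters"
    has1 = 0
    for i, s in enumerate(states):
        if s == 1:
            has1 |= 1 << i
    used = (1 << len(states)) - 1  # bits that correspond to real characters
    return used & ~has1, has1

def fitch_downpass(tree, leaf_sets, used):
    """Fitch down-pass over all packed characters at once.
    tree is a leaf name (str) or a (left, right) tuple.
    Returns ((mask0, mask1), steps)."""
    if isinstance(tree, str):
        return leaf_sets[tree], 0
    (l0, l1), left_steps = fitch_downpass(tree[0], leaf_sets, used)
    (r0, r1), right_steps = fitch_downpass(tree[1], leaf_sets, used)
    i0, i1 = l0 & r0, l1 & r1          # per-character intersections
    empty = used & ~(i0 | i1)          # characters whose intersection is empty
    s0 = i0 | ((l0 | r0) & empty)      # fall back to the union where empty
    s1 = i1 | ((l1 | r1) & empty)
    steps = left_steps + right_steps + bin(empty).count("1")
    return (s0, s1), steps

def parsimony_length(tree, matrix):
    """Total Fitch length of a tree for a binary matrix {taxon: [0/1, ...]}."""
    n_chars = len(next(iter(matrix.values())))
    used = (1 << n_chars) - 1
    leaf_sets = {taxon: pack(states) for taxon, states in matrix.items()}
    return fitch_downpass(tree, leaf_sets, used)[1]
```

Each internal node costs a handful of AND/OR operations and one population count regardless of how many of the 32 characters change there, which is where the speed-up over a per-character loop would come from.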

13.
We have developed a novel cost-effective procedure, namely ‘chemical nanoprinting’, for oligonucleotide or cDNA chip manufacture. In this thermo-controlled process, the oligonucleotides, covalently attached to a highly loaded ‘master-chip’ through disulfide bonds, are chemically transferred to the acrylamide layer mounted on a ‘print-chip’. It is demonstrated here that multiple identical print-chips can be produced from a single master-chip. This duplication process is a few hundred times faster than any existing method, and the speed of the process and the cost incurred are independent of the scale of the DNA chips.

14.
A new consensus method for summarizing competing phylogenetic hypotheses, weighted compromise, is described. The method corrects for a bias inherent in majority‐rule consensus/compromise trees when the source trees exhibit non‐independence due to ambiguity in terminal clades. Suggestions are given for its employment in parsimony analyses and tree resampling strategies such as bootstrapping and jackknifing. An R function is described that can be used with the programming language R to produce the consensus.

15.
New algorithms for calculating the most parsimonious state sets for polytomies under Fitch parsimony are described. Because they are based on state set operations, these algorithms can be extended for optimization of several characters in parallel, thus increasing speed by a significant factor. This speed increase may facilitate analysis of molecular data sets, many of which contain hundreds of taxa, thousands of multistate nonadditive characters, and numerous polytomies.

16.
The maximum likelihood (ML) method of phylogenetic tree construction is not as widely used as other tree construction methods (e.g., parsimony, neighbor-joining) because of the prohibitive amount of time required to find the ML tree when the number of sequences under consideration is large. To overcome this difficulty, we propose a stochastic search strategy for estimation of the ML tree that is based on a simulated annealing algorithm. The algorithm works by moving through tree space by way of a "local rearrangement" strategy so that topologies that improve the likelihood are always accepted, whereas those that decrease the likelihood are accepted with a probability that is related to the proportionate decrease in likelihood. Besides greatly reducing the time required to estimate the ML tree, the stochastic search strategy is less likely to become trapped in local optima than are existing algorithms for ML tree estimation. We demonstrate the success of the modified simulated annealing algorithm by comparing it with two existing algorithms (Swofford's PAUP* and Felsenstein's DNAMLK) for several theoretical and real data examples.
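The acceptance rule described in this abstract can be sketched generically. The abstract does not give the exact formula, so the code below assumes a common Metropolis-style rule on log-likelihoods, exp((new - old) / T), with a geometric cooling schedule; the function names, parameters, and the idea of passing `score` and `neighbor` as callables are all illustrative assumptions rather than the authors' algorithm.

```python
import math
import random

def acceptance_probability(old_lnl, new_lnl, temperature):
    """Improvements are always accepted; worse states are accepted with a
    probability that shrinks as the loss in log-likelihood grows (assumed rule)."""
    if new_lnl >= old_lnl:
        return 1.0
    return math.exp((new_lnl - old_lnl) / temperature)

def anneal(score, neighbor, start, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Generic simulated annealing loop.
    score(state) returns a log-likelihood-like value to maximize;
    neighbor(state, rng) proposes a local rearrangement of the state."""
    rng = random.Random(seed)
    current, current_score = start, score(start)
    best, best_score = current, current_score
    temperature = t0
    for _ in range(steps):
        candidate = neighbor(current, rng)
        candidate_score = score(candidate)
        p = acceptance_probability(current_score, candidate_score, temperature)
        if rng.random() < p:
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = current, current_score
        temperature *= cooling  # geometric cooling (assumed schedule)
    return best, best_score
```

For a real tree search, `score` would evaluate the likelihood of a topology and `neighbor` would apply a local rearrangement; the occasional acceptance of worse topologies early on is what lets the search leave local optima.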

17.
It is widely accepted that mitochondrial DNA (mtDNA) control region evolves faster than protein encoding genes with few exceptions. In the present study, we sequenced the mitochondrial cytochrome b gene (cyt b) and control region (CR) and compared their rates in 93 specimens representing 67 species of loaches and some related taxa in the Cobitoidea (Order Cypriniformes). The results showed that sequence divergences of the CR were broadly higher than those of the cyt b (about 1.83 times). However, in considering only closely related species, CR sequence evolution was slower than that of cyt b gene (ratio of CR/cyt b is 0.78), a pattern that is found to be very common in Cypriniformes. Combined data of the cyt b and CR were used to estimate the phylogenetic relationship of the Cobitoidea by maximum parsimony, neighbor-joining, and Bayesian methods. With Cyprinus carpio and Danio rerio as outgroups, three analyses identified the same four lineages representing four subfamilies of loaches, with Botiinae on the basal-most clade. The phylogenetic relationship of the Cobitoidea was ((Catostomidae+Gyrinocheilidae)+(Botiinae+(Balitorinae+(Cobitinae+Nemacheilinae)))), which indicated that Sawada's Cobitidae (including Cobitinae and Botiinae) was not monophyletic. Our molecular phylogenetic analyses are in very close agreement with the phylogenetic results based on the morphological data proposed by Nalbant and Bianco, wherein these four subfamilies were elevated to the family level as Botiidae, Balitoridae, Cobitidae, and Nemacheilidae.

18.
The weta Hemideina crassidens has two chromosomal races that differ by two centric fusions or fissions. The mitochondrial DNA of weta from both chromosomal races and a sister species were sequenced for a 750-bp region of the gene coding for cytochrome oxidase I. The average pairwise genetic distance among the 15 (XO)-chromosome race weta was almost four times greater than the average distance among the 19 (XO)-chromosome race weta. The weta from the 19-chromosome race formed a well-supported monophyletic clade in all shortest maximum parsimony trees. Maximum likelihood and neighbor-joining trees suggested that the 15-chromosome karyotype was paraphyletic with respect to the 19-chromosome karyotype, but this was not supported by maximum parsimony analyses. Although phylogenetic analysis could not exclude chromosome fusion as the rearrangement responsible for the karyotype differentiation, the level of sequence variation and pattern of distribution appear to implicate fission as the more likely event.

19.
The phylogenetic relationships of the members of the phylum Sipuncula are investigated by means of DNA sequence data from three nuclear markers, two ribosomal genes (18S rRNA and the D3 expansion fragment of 28S rRNA), and one protein-coding gene, histone H3. Phylogenetic analysis via direct optimization of DNA sequence data using parsimony as optimality criterion is executed for 12 combinations of parameter sets accounting for different indel costs and transversion/transition cost ratios in a sensitivity analysis framework. Alternative outgroup analyses are also performed to test whether they affected rooting of the sipunculan topology. Nodal support is measured by parsimony jackknifing and Bremer support values. Results from the different partitions are highly congruent, and the combined analysis for the parameter set that minimizes overall incongruence supports monophyly of Sipuncula, but nonmonophyly of several higher taxa recognized for the phylum. Mostly responsible for this is the split of the family Sipunculidae in three main lineages, with the genus Sipunculus being the sister group to the remaining sipunculans, the genus Phascolopsis nesting within the Golfingiiformes, and the genus Siphonosoma being associated to the Phascolosomatidea. Other interesting results are the position of Phascolion within Golfingiidae and the position of Antillesoma within Aspidosiphonidae. These results are not affected by the loci selected or by the outgroup chosen. The position of Apionsoma is discussed, although more data would be needed to better ascertain its phylogenetic affinities. Monophyly of the genera with multiple representatives (Themiste, Aspidosiphon, and Phascolosoma) is well supported, but not the monophyly of the genera Nephasoma or Golfingia. Interesting phylogeographic questions arise from analysis of multiple representatives of a few species.

20.

Background  

In phylogenetic analysis we face the problem that several subclade topologies are known or easily inferred and well supported by bootstrap analysis, but basal branching patterns cannot be unambiguously estimated by the usual methods (maximum parsimony (MP), neighbor-joining (NJ), or maximum likelihood (ML)), nor are they well supported. We represent each subclade by a sequence profile and estimate evolutionary distances between profiles to obtain a matrix of distances between subclades.
