Similar articles
 Found 20 similar articles (search time: 468 ms)
1.
Many investigators categorize individuals from hybrid zones to facilitate comparisons among genotypic classes (e.g., parental, F1, backcross) for comparative studies in which components of fitness or geographic variation are being analyzed. Frequently, multiple character sets representing genetically independent traits are used to classify these individuals and various methodologies are employed to combine the classifications obtained from the different character sets. We adapted the principles of total evidence and taxonomic congruence (two formalized approaches used by systematists in formulating phylogenetic hypotheses) to address the problem of discriminating hybridizing species and classifying individuals from hybrid zones. As our model, we used two morphological (coloration and morphometric) and two molecular (allozyme and mitochondrial DNA restriction-fragment-length polymorphism) character sets that differentiate two stone crab species (Menippe adina and M. mercenaria). Using principal-components analysis, we determined that combining character sets and eliminating characters or character sets that did not have large eigenvector coefficients for the principal component that best separated the two species yielded the highest level of discrimination between species and allowed us to classify a broad range of morpho-genotypes as hybrids. For the stone crabs, three diagnostic allozyme loci and five diagnostic coloration characters best separated the species. The two character sets were not completely congruent, but they agreed in their classification of 50% of the individuals from the hybrid zone and rarely strongly disagreed in their classifications. Classification discrepancies between the two character sets probably represent variation between traits in interspecific gene flow rather than intraspecific, ecologically mediated variation. 
Our results support the assertions of previous investigators who espoused the benefits associated with using multiple character sets to classify individuals from hybrid zones and demonstrate that, if character sets are reasonably congruent and numerically balanced, combining diagnostic characters from multiple character sets (a total-evidence approach) can enhance discriminatory power between species and facilitate the assignment of hybrid-zone individuals to genotypic classes. In contrast, classifying hybrid-zone individuals using character sets separately (a taxonomic-congruence approach) provides the opportunity to compare levels of introgression between species and to assess reasons for discordance among the data sets.

2.
Partitioned Bremer support (PBS) is a valuable means of assessing congruence in combined data sets, but some aspects require clarification. When more than one equally parsimonious tree is found during the constrained search for trees lacking the node of interest, averaging PBS for each data set across these trees can conceal conflict, and PBS should ideally be examined for each constrained tree. Similarly, when multiple most parsimonious trees (MPTs) are generated during analysis of the combined data, PBS is usually calculated on the consensus tree. However, extra information can be obtained if PBS is calculated on each of the MPTs or even suboptimal trees.
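The averaging problem described in this abstract can be illustrated with a minimal sketch. It assumes the usual definition of PBS (a partition's length on a constrained tree lacking the node, minus its length on the most parsimonious tree). The partition names and all step counts below are hypothetical and are not taken from the paper.

```python
def partitioned_bremer(mpt_lengths, constrained_lengths):
    """PBS per partition: steps on the constrained tree minus steps on the MPT."""
    return {p: constrained_lengths[p] - mpt_lengths[p] for p in mpt_lengths}

# Hypothetical per-partition lengths (parsimony steps) on the MPT (300 steps total).
mpt = {"morphology": 100, "mtDNA": 200}

# Two equally parsimonious constrained trees (302 steps each) lacking the node.
constrained_trees = [
    {"morphology": 106, "mtDNA": 196},
    {"morphology": 100, "mtDNA": 202},
]

# PBS examined separately for each constrained tree.
per_tree = [partitioned_bremer(mpt, t) for t in constrained_trees]

# PBS averaged across the constrained trees.
averaged = {p: sum(d[p] for d in per_tree) / len(per_tree) for p in mpt}

print(per_tree)
print(averaged)
```

In this made-up case the first constrained tree shows a PBS of -4 for the mtDNA partition, while the averaged value is only -1, so the strength of the conflict on that tree is muted by averaging.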

3.
Mitochondrial cytochrome b sequence data from 15 species of herons (Aves: Ardeidae), representing 13 genera, were compared with DNA hybridization data of single-copy nuclear DNA (scnDNA) from the same species in a taxonomic congruence assessment of heron phylogeny. The two data sets produced a partially resolved, completely congruent estimate of phylogeny with the following basic structure: (Tigrisoma, Cochlearius, (((Zebrilus, (Ixobrychus, Botaurus)), (((Ardea, Casmerodius), Bubulcus), ((Egretta thula, Egretta caerulea, Egretta tricolor), Syrigma), Butorides, Nycticorax, Nyctanassa)))). Because congruence indicated similar phylogenetic information in the two data sets, we used the relatively unsaturated DNA hybridization distances as surrogates of time to examine graphically the patterns and rates of change in cytochrome b distances. Cytochrome b distances were computed either from whole sequences or from partitioned sequences consisting of transitions, transversions, specific codon site positions, or specific protein-coding regions. These graphical comparisons indicated that unpartitioned cytochrome b has evolved at 5-10 times the rate of scnDNA. Third-position transversions appeared to offer the most useful sequence partition for phylogenetic analysis because of their relatively fast rate of substitution (two times that of scnDNA) and negligible saturation. We also examined lineage-based rates of evolution by comparing branch length patterns between the nuclear and cytochrome b trees. The degree of correlation in corresponding branch lengths between cytochrome b and DNA hybridization trees depended on DNA sequence partitioning. When cytochrome b sequences were not partitioned, branch lengths in the cytochrome b and DNA hybridization trees were not correlated. However, when cytochrome b sequences were reduced to third-position transversions (i.e., unsaturated, relatively fast changing data), branch lengths were correlated. 
This finding suggests that lineage-based rates of DNA evolution in nuclear and mitochondrial genomes are influenced by common causes.

4.
Comparative phylogeography has emerged as a means of understanding the spatial patterns of genetic divergence of codistributed species. However, researchers are often frustrated because of the lack of appropriate statistical tests to assess concordancy of multiple phylogeographic trees. We develop a method for testing congruence across multiple species and synthesizing the data into a regional supertree. Nine phylogeographic data sets of species with different life histories and ecologies were statistically compared using maximum agreement subtrees (MAST) and showed a high degree of concordancy. A supertree combining the different phylogeographic trees was then computed using matrix representation with parsimony, and the groups defined by this supertree were tested against climatic data to investigate a potential mechanism driving divergence. Our data suggest that species and genetic lineages in California are shaped by climatic regimes. The supertree method in combination with MAST represents a new approach to test congruence hypotheses and detect common geographic signals in comparative phylogeography.

5.
In phylogenetic analyses with combined multigene or multiprotein data sets, accounting for differing evolutionary dynamics at different loci is essential for accurate tree prediction. Existing maximum likelihood (ML) and Bayesian approaches are computationally intensive. We present an alternative approach that is orders of magnitude faster. The method, Distance Rates (DistR), estimates rates based upon distances derived from gene/protein sequence data. Simulation studies indicate that this technique is accurate compared with other methods and robust to missing sequence data. The DistR method was applied to a fungal mitochondrial data set, and the rate estimates compared well to those obtained using existing ML and Bayesian approaches. Inclusion of the protein rates estimated from the DistR method into the ML calculation of trees as a branch length multiplier resulted in a significantly improved fit as measured by the Akaike Information Criterion (AIC). Furthermore, bootstrap support for the ML topology was significantly greater when protein rates were used, and some evident errors in the concatenated ML tree topology (i.e., without protein rates) were corrected. [Bayesian credible intervals; DistR method; multigene phylogeny; PHYML; rate heterogeneity.]
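As a reminder of how a fit comparison by AIC works, the following sketch uses the standard formula AIC = 2k - 2 ln L, where lower values indicate a better fit. The log-likelihoods and parameter counts are hypothetical placeholders, not results from the DistR study.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical values: a concatenated model versus one with extra per-protein rate parameters.
aic_concatenated = aic(log_likelihood=-12500.0, n_params=40)
aic_with_rates = aic(log_likelihood=-12380.0, n_params=52)

# A positive difference means the model with rates fits better despite its extra parameters.
delta = aic_concatenated - aic_with_rates
print(aic_concatenated, aic_with_rates, delta)
```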

6.
The bootstrap is an important tool for estimating the confidence interval of monophyletic groups within phylogenies. Although bootstrap analyses are used in most evolutionary studies, there is no clear consensus as to how best to interpret bootstrap probability values. To study the bootstrap method further, nine small subunit ribosomal DNA (SSU rDNA) data sets were submitted to bootstrapped maximum parsimony (MP) analyses using unweighted and weighted sequence positions. Analyses of the lengths (i.e., parsimony steps) of the bootstrap trees show that the shape and mean of the bootstrap tree distribution may provide important insights into the evolutionary signal within the sequence data. With complex phylogenies containing nodes defined by short internal branches (multifurcations), the mean of the bootstrap tree distribution may differ by 2 standard deviations from the length of the best tree found from the original data set. Weighting sequence positions significantly increases the bootstrap values at internal nodes. There may, however, be strong bootstrap support for conflicting species groupings among different data sets. This phenomenon appears to result from a correlation between the topology of the tree used to create the weights and the topology of the bootstrap consensus tree inferred from the MP analysis of these weighted data. The analyses also show that characteristics of the bootstrap tree distribution (e.g., skewness) may be used to choose between alternative weighting schemes for phylogenetic analyses.

7.
Whole-genome duplication (WGD) produces sets of gene pairs that are all of the same age. We therefore expect that phylogenetic trees that relate these pairs to their orthologs in other species should show a single consistent topology. However, a previous study of gene pairs formed by WGD in the yeast Saccharomyces cerevisiae found conflicting topologies among neighbor-joining (NJ) trees drawn from different loci and suggested that this conflict was the result of "asynchronous functional divergence" of duplicated genes (Langkjaer, R. B., P. F. Cliften, M. Johnston, and J. Piskur. 2003. Yeast genome duplication was followed by asynchronous differentiation of duplicated genes. Nature 421:848-852). Here, we test whether the conflicting topologies might instead be due to asymmetrical rates of evolution leading to long-branch attraction (LBA) artifacts in phylogenetic trees. We constructed trees for 433 pairs of WGD paralogs in S. cerevisiae with their single orthologs in Saccharomyces kluyveri and Candida albicans. We find a strong correlation between the asymmetry of evolutionary rates of a pair of S. cerevisiae paralogs and the topology of the tree inferred for that pair. Saccharomyces cerevisiae gene pairs with approximately equal rates of evolution tend to give phylogenies in which the WGD postdates the speciation between S. cerevisiae and S. kluyveri (B-trees), whereas trees drawn from gene pairs with asymmetrical rates tend to show WGD pre-dating this speciation (A-trees). Gene order data from throughout the genome indicate that the "A-trees" are artifacts, even though more than 50% of gene pairs are inferred to have this topology when the NJ method as implemented in ClustalW (i.e., with Poisson correction of distances) is used to construct the trees. This LBA artifact can be ameliorated, but not eliminated, by using gamma-corrected distances or by using maximum likelihood trees with robustness estimated by the Shimodaira-Hasegawa test. 
Tests for adaptive evolution indicated that positive selection might be the cause of rate asymmetry in a substantial fraction (19%) of the paralog pairs.

8.
We compared four approaches for analyzing three data sets derived from staphylinoid beetles, a superfamily whose known species diversity is roughly comparable to that of vertebrates. One data set is derived from adult morphology and the two molecular data sets are from 12S ribosomal RNA and cytochrome b mitochondrial DNA. We found that taxonomic congruence following conditional data combination, herein called compatible evidence (CE), resolved more nodes compatible with an initial conservative hypothesis than did total evidence (TE), conditional data combination (CDC), or taxonomic congruence (TC). CE sets a base of nodes obtained by CDC analysis and then investigates what further agreement may arise in a universe where these nodes are accepted as given. We suggest that CE75-75 may be appropriate for future studies that aim to both generate a well-corroborated tree and investigate conflicts between data sets, partitions, and characters. CE75-75 is a 75% bootstrap consensus CDC tree followed by combinable-component consensus of a 75% bootstrap consensus of each homogeneous set of partitions having hierarchical structure.

9.
Direct optimization of unaligned sequence characters provides a natural framework to explore the sensitivity of phylogenetic hypotheses to variation in analytical parameters. Phenotypic data, when combined into such analyses, are typically analyzed with static homology correspondences unlike the dynamic homology sequence data. Static homology characters may be expected to constrain the direct optimization and thus potentially increase the similarity of phylogenetic hypotheses under different cost sets. However, whether a total-evidence approach increases the phylogenetic stability or not remains empirically largely unexplored. Here, I studied the impact of static homology data on sensitivity using six empirical data sets composed of several molecular markers and phenotypic data. The inclusion of static homology phenotypic data increased the average stability of phylogenetic hypotheses in five out of the six data sets. To investigate if any static homology characters would have a similar effect, the analyses were repeated with randomized phenotypic data, and with one of the molecular markers fixed as static homology characters. These analyses had, on average, almost no effect on the phylogenetic stability, although the randomized phenotypic data sometimes resulted in even higher stability than empirical phenotypic data. The impact was related to the strength of the phylogenetic signal in the phenotypic data: higher average jackknife support of the phenotypic tree correlated with a stronger stabilizing effect in the total-evidence analysis. Phenotypic data with a strong signal made the total-evidence trees topologically more similar to the phenotypic trees; thus, they constrained the dynamic homology correspondences of the sequence data. Characters that increase phylogenetic stability are particularly valuable for phylogenetic inference. 
These results indicate an important role and additive value of phenotypic data in increasing the stability of phylogenetic hypotheses in total-evidence analyses.

10.
This paper examines a recent proposal to calculate supertrees by minimizing the sum of subtree prune‐and‐regraft distances to the input trees. The supertrees thus calculated may display groups present in a minority of the input trees but contradicted by the majority, or groups that are not supported by any input tree or combination of input trees. The proponents of the method themselves stated that these are serious problems of “matrix representation with parsimony”, but they can in fact occur in their own method. The majority rule supertrees, being explicitly clade‐based, cannot have these problems, and seem much more suited to retrieving common clades from a set of trees with different taxon sets. However, it is dubious that so‐called majority rule supertrees can always be interpreted as displaying those clades present in (or compatible with) a majority of the trees. The majority rule consensus is always a median tree, in terms of the Robinson–Foulds distances (i.e. it minimizes the sum of Robinson–Foulds distances to the input trees). In contrast, majority rule supertrees may not be median: different, contradictory trees may minimize Robinson–Foulds distances, while their strict consensus does not. If being “majority” results from being median in Robinson–Foulds distances, this means that in the supertree setting a “majority” is ambiguously defined, sometimes achievable only by mutually contradictory trees.
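The median property of the majority rule consensus can be checked on a toy case. The sketch below assumes rooted trees on the same taxon set, each represented simply as the set of its non-trivial clades, with the Robinson–Foulds distance taken as the size of the symmetric difference of two clade sets. The three input trees and the helper names are hypothetical.

```python
from collections import Counter

def clades(*groups):
    """Represent a rooted tree as a frozenset of clades (each clade a frozenset of taxa)."""
    return frozenset(frozenset(g) for g in groups)

def rf_distance(t1, t2):
    """Robinson-Foulds distance: number of clades found in exactly one of the two trees."""
    return len(t1 ^ t2)

def majority_rule(trees):
    """Keep the clades present in more than half of the input trees."""
    counts = Counter(c for t in trees for c in t)
    return frozenset(c for c, n in counts.items() if n > len(trees) / 2)

def total_rf(tree, trees):
    """Sum of Robinson-Foulds distances from one tree to all input trees."""
    return sum(rf_distance(tree, u) for u in trees)

# Three hypothetical rooted trees on taxa A, B, C, D.
trees = [
    clades("AB", "ABC"),  # (((A,B),C),D)
    clades("AB", "ABC"),
    clades("BC", "ABC"),  # ((A,(B,C)),D)
]

consensus = majority_rule(trees)
print(total_rf(consensus, trees), [total_rf(t, trees) for t in trees])
```

Here the consensus has a summed distance of 2, no larger than that of any input tree, which is consistent with the median property stated in the abstract. The supertree case, where taxon sets differ, is not covered by this sketch.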

11.
Cophylogeny is the congruence of phylogenetic relationships between two different groups of organisms due to their long‐term interaction. We investigated the use of tree shape distance measures to quantify the degree of cophylogeny. We implemented a reverse‐time simulation model of pathogen phylogenies within a fixed host tree, given cospeciation probability, host switching, and pathogen speciation rates. We used this model to evaluate 18 distance measures between host and pathogen trees including two kernel distances that we developed for labeled and unlabeled trees, which use branch lengths and accommodate different size trees. Finally, we used these measures to revisit published cophylogenetic studies, where authors described the observed associations as representing a high or low degree of cophylogeny. Our simulations demonstrated that some measures are more informative than others with respect to specific coevolution parameters, especially when these did not assume extreme values. For real datasets, a projection of the trees’ associations revealed clustering of high-concordance studies, suggesting that investigators describe the degree of cophylogeny in a consistent way. Our results support the hypothesis that measures can be useful for quantifying cophylogeny. This motivates their usage in the field of coevolution and supports the development of simulation‐based methods, i.e., approximate Bayesian computation, to estimate the underlying coevolutionary parameters.

12.
A phylogenetic analysis of mitochondrial and nuclear rDNA sequences from species of all the superfamilies of the insect order Orthoptera (grasshoppers, crickets, and relatives) confirmed that although mitochondrial sequences provided good resolution of the youngest superfamilies, nuclear rDNA sequences were necessary to separate the basal groups. To try to reconcile these data sets into a single, fully resolved orthopteran phylogeny, we adopted consensus and combined data strategies. The consensus analysis produced a partially resolved tree that lacked several well-supported features of the individual analyses. However, this lack of resolution was explained by an examination of resampled data sets, which identified the likely source of error as the relatively short length of the individual mitochondrial data partitions. In a subsequent comparison in which the mitochondrial sequences were initially combined, we observed less conflict. We then used two approaches to examine the validity of combining all of the data in a single analysis: comparative analysis of trees recovered from resampled data sets, and the application of a randomization test. Because the results did not point to significant levels of heterogeneity in phylogenetic signal between the mitochondrial and nuclear data sets, we therefore proceeded with a combined analysis. Reconstructing phylogenies under the minimum evolution and maximum likelihood optimality criteria, we examined monophyly of the major orthopteran groups, using nonparametric and parametric bootstrap analysis and Kishino-Hasegawa tests. Our analysis suggests that phylogeny reconstruction under the maximum likelihood criteria is the most discriminating approach for the combined sequences. The results indicate, moreover, that the caeliferan Pneumoroidea and Pamphagoidea, as previously suggested, are polyphyletic. 
The Acridoidea is redefined to include all pamphagoid families other than the Pyrgomorphidae, which we propose should be accorded superfamily status.

13.
Although some recent morphological and molecular studies agree that Cetacea is closely related to Hippopotamidae, there is little consensus on the phylogeny within Cetartiodactyla. We addressed this problem by conducting two analyses: (1) a simultaneous cladistic analysis of intrinsic data (morphology and molecules) and (2) a stratocladistic analysis, which included morphological, molecular, and stratigraphic data. Unlike previous simultaneous analyses, we had the opportunity to include data from the recently described hindlimbs of protocetid and pakicetid cetaceans. Our intrinsic dataset includes 73 taxa scored for 8,229 informative characters, of which 208 are morphological and 8,021 molecular. Both analyses supported the exclusion of Mesonychia from Cetartiodactyla and a close phylogenetic relationship between Hippopotamidae and Cetacea. Many polytomies in the strict consensus of the most parsimonious trees for the intrinsic dataset can be attributed to differing positions for Raoellidae, which in some trees is the sister-group to Cetacea. Pruning Raoellidae and 18 other taxa from all most parsimonious trees produced a fully resolved agreement subtree, which indicates that the Old World taxa Cebochoerus and Mixtotherium are successive stem taxa to Whippomorpha (i.e., Cetacea + Hippopotamidae). The main result of adding stratigraphic information to the intrinsic dataset was that we found fewer most parsimonious trees, which in most respects were congruent with a subset of the shortest trees for the intrinsic dataset. Our stratocladistic analysis supports species of Diacodexis as the most basal cetartiodactyls, a clade of suiform cetartiodactyls, a monophyletic Tylopoda that includes Protoceratidae, and a monophyletic Carnivora. We were unable to identify any pre-Miocene stem taxa to Hippopotamidae, thus its ghost lineage is still 39 million years long. 
The relatively low Bremer support for many nodes in our trees indicates that our phylogenetic hypotheses should be subjected to further testing.

14.
Pol D. Systematic Biology, 2004, 53(6): 949-962
Advocates of maximum likelihood (ML) approaches to phylogenetics commonly cite as one of their primary advantages the use of objective statistical criteria for model selection. Currently, a particular implementation of the likelihood ratio test (LRT) is the most commonly used model-selection criterion in phylogenetics. This approach requires the choice of a starting point and a parameter addition (or removal) sequence that can affect all ML inferences (i.e., topology, model, and all evolutionary parameters). Here, several alternative starting points and parameter sequences are tested in empirical data sets to assess their influence on model selection and optimal topology. In the studied data sets, varying model-selection protocols leads to selection of different models that, in some cases, lead to different ML trees. Given the sensitivity of the LRT, some possible solutions to model selection (within the hypothesis testing approach) are outlined, and alternative model-selection criteria are discussed. Some of the suggested alternatives seem to lack these problems, although their behavior and adequacy for phylogenetics need to be further explored.

15.
16.
THE EFFECT OF ORDERED CHARACTERS ON PHYLOGENETIC RECONSTRUCTION   Total citations: 2, self-citations: 0, citations by others: 2
Abstract: Morphological structures are likely to undergo more than a single change during the course of evolution. As a result, multistate characters are common in systematic studies and must be dealt with. Particularly interesting is the question of whether multistate characters should be treated as ordered (additive) or unordered (non-additive). In accepting a particular hypothesis of order, numerous others are necessarily rejected. We review some of the criteria often used to order character states and the underlying assumptions inherent in these criteria.
The effects that ordered multistate characters can have on phylogenetic reconstruction are examined using 27 data sets. It has been suggested that hypotheses of character state order are more informative than hypotheses of unorder and may restrict the number of equally parsimonious trees as well as increase tree resolution. Our results indicate that ordered characters can produce more, the same number of, or fewer equally parsimonious trees and can increase, decrease or have no effect on tree resolution. The effect on tree resolution can be a simple gain in resolution or a dramatic change in sister-taxa relationships. In cases where several outgroups are included in the data matrix, hypotheses of order can change character polarities by altering outgroup topology. Ordered characters result in a different topology from unordered characters only when the hierarchy of the cladogram disagrees with the investigator's a priori hypothesis of order. If the best criterion for assessing character evolution is congruence with other characters, the practice of ordering multistate characters is inappropriate.
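The difference between ordered (additive) and unordered (non-additive) treatment comes down to how a change between two states is counted. The sketch below assumes the common convention that an ordered change from state i to state j costs |i - j| steps, while any unordered change costs one step. The branch states are hypothetical and are fixed by hand rather than optimized on a tree.

```python
def step_cost(a, b, ordered):
    """Cost of changing state a into state b along one branch."""
    return abs(a - b) if ordered else int(a != b)

def tree_length(branches, ordered):
    """Sum step costs over (ancestor_state, descendant_state) pairs."""
    return sum(step_cost(a, b, ordered) for a, b in branches)

# Hypothetical states of one multistate character at the two ends of each branch.
branches = [(0, 0), (0, 2), (2, 2), (0, 1)]

ordered_length = tree_length(branches, ordered=True)
unordered_length = tree_length(branches, ordered=False)
print(ordered_length, unordered_length)
```

The direct change from state 0 to state 2 counts as two steps when the character is ordered and one step when it is unordered, which is how a hypothesis of order can alter which trees are most parsimonious.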

17.
The behavior of two topological and four character‐based congruence measures was explored using different indel treatments in three empirical data sets, each with different alignment difficulties. The analyses were done using direct optimization within a sensitivity analysis framework in which the cost of indels was varied. Indels were treated either as a fifth character state, or strings of contiguous gaps were considered single events by using linear affine gap cost. Congruence consistently improved when indels were treated as single events, but no congruence measure appeared as the obviously preferable one. However, when combining enough data, all congruence measures clearly tended to select the same alignment cost set as the optimal one. Disagreement among congruence measures was mostly caused by a dominant fragment or a data partition that included all or most of the length variation in the data set. Dominance was easily detected, as the character‐based congruence measures approached their optimal value when indel costs were incremented. Dominance of a fragment or data partition was overwhelmed when new sequence length‐variable fragments or data partitions were added. © The Willi Hennig Society 2005.

18.
We inferred the phylogeny of 33 species of ticks from the subfamilies Rhipicephalinae and Hyalomminae from analyses of nuclear and mitochondrial DNA and morphology. We used nucleotide sequences from 12S rRNA, cytochrome c oxidase I, internal transcribed spacer 2 of the nuclear rRNA, and 18S rRNA. Nucleotide sequences and morphology were analyzed separately and together in a total-evidence analysis. Analyses of the five partitions together (3303 characters) gave the best-resolved and the best-supported hypothesis so far for the phylogeny of ticks in the Rhipicephalinae and Hyalomminae, despite the fact that some partitions did not have data for some taxa. However, most of the hidden conflict (lower support in the total-evidence analyses compared to that in the individual analyses) was found in those partitions that had taxa without data. The partitions with complete taxonomic sampling had more hidden support (higher support in the total-evidence analyses compared to that in the separate-partition analyses) than hidden conflict. Mapping of geographic origins of ticks onto our phylogeny indicates an African origin for the Rhipicephalinae sensu lato (i.e., including Hyalomma spp.), the Rhipicephalus-Boophilus lineage, the Dermacentor-Anocentor lineage, and the Rhipicephalus-Boophilus-Nosomma-Hyalomma-Rhipicentor lineage. The Nosomma-Hyalomma lineage appears to have evolved in Asia. Our total-evidence phylogeny indicates that (i) the genus Rhipicephalus is paraphyletic with respect to the genus Boophilus, (ii) the genus Dermacentor is paraphyletic with respect to the genus Anocentor, and (iii) some subgenera of the genera Hyalomma and Rhipicephalus are paraphyletic with respect to other subgenera in these genera. 
Study of the Rhipicephalinae and Hyalomminae over the last 7 years has shown that analyses of individual datasets (e.g., one gene or morphology) seldom resolve many phylogenetic relationships, but analyses of more than one dataset can generate well-resolved phylogenies for these ticks.

19.
Collections of phylogenetic trees are usually summarized using consensus methods. These methods build a single tree, supposed to be representative of the collection. However, in the case of heterogeneous collections of trees, the resulting consensus may be poorly resolved (strict consensus, majority-rule consensus, ...), or may perform arbitrary choices among mutually incompatible clades, or splits (greedy consensus). Here, we propose an alternative method, which we call the multipolar consensus (MPC). Its aim is to display all the splits having a support above a predefined threshold, in a minimum number of consensus trees, or poles. We show that the problem is equivalent to a graph-coloring problem, and propose an implementation of the method. Finally, we apply the MPC to real data sets. Our results indicate that, typically, all the splits down to a weight of 10% can be displayed in no more than 4 trees. In addition, in some cases, biologically relevant secondary signals, which would not have been present in any of the classical consensus trees, are indeed captured by our method, indicating that the MPC provides a convenient exploratory method for phylogenetic analysis. The method was implemented in a package freely available at http://www.lirmm.fr/~cbonnard/MPC.html
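The graph-coloring view mentioned in this abstract can be sketched as follows. This is not the authors' implementation. It is a minimal illustration that assumes splits are given by one side of the bipartition, treats two splits as conflicting when none of the four pairwise intersections of their sides is empty, and assigns splits to poles with a simple first-fit greedy coloring, which does not guarantee the minimum number of poles. The taxa, splits, and function names are hypothetical.

```python
TAXA = frozenset("ABCDE")

def compatible(s1, s2, taxa=TAXA):
    """Two splits are compatible if some side of one is disjoint from some side of the other."""
    sides1 = (s1, taxa - s1)
    sides2 = (s2, taxa - s2)
    return any(not (a & b) for a in sides1 for b in sides2)

def greedy_poles(splits):
    """First-fit coloring: put each split in the first pole where it conflicts with nothing."""
    poles = []
    for s in splits:
        for pole in poles:
            if all(compatible(s, t) for t in pole):
                pole.append(s)
                break
        else:
            poles.append([s])  # no existing pole fits, so open a new one
    return poles

# Hypothetical splits that passed the support threshold, each named by one side.
splits = [frozenset("AB"), frozenset("CD"), frozenset("BC"), frozenset("DE")]
poles = greedy_poles(splits)
print([sorted("".join(sorted(s)) for s in pole) for pole in poles])
```

Each pole ends up holding pairwise compatible splits, which can therefore be displayed together on a single tree.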

20.
We performed different consensus methods by combining binary classifiers, mostly machine learning classifiers, with the aim to test their capability as predictive tools for the presence–absence of marine phytoplankton species. The consensus methods were constructed by considering a combination of four methods (i.e., generalized linear models, random forests, boosting and support vector machines). Six different consensus methods were analyzed by taking into account six different ways of combining single-model predictions. Some of these methods are presented here for the first time. To evaluate the performance of the models, we considered eight phytoplankton species presence–absence data sets and data related to environmental variables. Some of the analyzed species are toxic, whereas others provoke water discoloration, which can cause alarm in the population. Besides the phytoplankton data sets, we tested the models on 10 well-known open access data sets. We evaluated the models' performances over a test sample. For most (72%) of the data sets, a consensus method was the method with the lowest classification error. In particular, a consensus method that weighted single-model predictions in accordance with single-model performances (the weighted average prediction error model, WA-PE) was the one that presented the lowest classification error most of the time. For the phytoplankton species, the errors of the WA-PE model were between 10% for the species Akashiwo sanguinea and 38% for Dinophysis acuminata. This study provides novel approaches to improve the prediction accuracy in species distribution studies and, in particular, in those concerning marine phytoplankton species.
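The idea of weighting single-model predictions by single-model performance can be sketched as below. The abstract does not give the exact WA-PE weighting, so this example assumes, purely for illustration, that each model's weight is one minus its classification error and that the weighted mean probability is thresholded at 0.5. The probabilities and error rates are hypothetical.

```python
def weighted_consensus(probabilities, errors, threshold=0.5):
    """Weighted mean of predicted presence probabilities, with weights 1 - error (assumed)."""
    weights = [1.0 - e for e in errors]
    score = sum(w * p for w, p in zip(weights, probabilities)) / sum(weights)
    return score, score >= threshold

# Hypothetical presence probabilities from four classifiers
# (e.g., GLM, random forest, boosting, SVM) and their validation error rates.
probabilities = [0.9, 0.4, 0.6, 0.3]
errors = [0.10, 0.30, 0.20, 0.40]

score, present = weighted_consensus(probabilities, errors)
print(round(score, 3), present)
```

More accurate models pull the consensus toward their own prediction, so the weighted score here sits above the plain unweighted mean of 0.55.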
