Similar literature
 20 similar articles found (search time: 31 ms)
1.
The problem of testing for congruence between phylogenetic data sets has long been debated among phylogeneticists, but it has become critical with the availability of large amounts of biological sequence data. Notably in prokaryotes, where lateral transfers are believed to be frequent, inferring phylogenies from multiple genes requires testing for incongruence before concatenating the genes. On another scale, incongruence tests can be used to detect recombination points within single-gene alignments. The incongruence length difference (ILD) test, based on parsimony, has proved useful for finding incongruent data sets, but its application remains limited to small data sets because of its computational cost. Here, we have adapted the principle of the ILD test to the BIONJ algorithm. This algorithm is based on a tree length minimisation criterion and is a suitable replacement for parsimony in this test when used with uncorrected distances (a model-free approach). We show that this new test, ILD-BIONJ, while being much faster, is often more accurate than the ILD test, especially when the alignments compared are simulated under different evolutionary models.
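
For orientation, the parsimony-based ILD statistic that the abstract above takes as its starting point can be sketched as follows. This is a toy illustration, not the ILD-BIONJ method: it assumes exactly four taxa so that the shortest tree can be found by checking all three unrooted topologies, and the function names (`fitch_steps`, `min_tree_length`, `ild`, `ild_test`) are hypothetical. The p-value convention, (hits + 1) / (replicates + 1), is one common choice and is also an assumption here.

```python
import random

# The three unrooted topologies for four taxa indexed 0..3.
QUARTETS = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]


def fitch_steps(column, topology):
    """Fitch parsimony steps for one character (a 4-symbol string) on one quartet."""
    steps = 0

    def join(x, y):
        nonlocal steps
        shared = x & y
        if shared:
            return shared
        steps += 1          # no shared state: one change is required
        return x | y

    (a, b), (c, d) = topology
    left = join({column[a]}, {column[b]})
    right = join({column[c]}, {column[d]})
    join(left, right)
    return steps


def min_tree_length(columns):
    """Length of the shortest tree, by exhaustive search over the three quartets."""
    return min(sum(fitch_steps(col, t) for col in columns) for t in QUARTETS)


def ild(part_a, part_b):
    """ILD = L(combined) - (L(A) + L(B)); never negative."""
    combined = list(part_a) + list(part_b)
    return min_tree_length(combined) - (
        min_tree_length(part_a) + min_tree_length(part_b)
    )


def ild_test(part_a, part_b, replicates=199, seed=0):
    """Compare the observed ILD with ILDs of random partitions of the same sizes."""
    rng = random.Random(seed)
    observed = ild(part_a, part_b)
    pooled = list(part_a) + list(part_b)
    hits = 0
    for _ in range(replicates):
        rng.shuffle(pooled)
        if ild(pooled[:len(part_a)], pooled[len(part_a):]) >= observed:
            hits += 1
    return observed, (hits + 1) / (replicates + 1)
```

Each character is a string of four states, one per taxon; for example, `ild(["0011"] * 3, ["0101"] * 3)` compares a partition supporting the grouping (0,1) with one supporting (0,2). Real data sets need a heuristic tree search in place of the exhaustive quartet search, which is where the computational cost mentioned in the abstract comes from.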

2.
The incongruence length difference (ILD) test is prone to suggesting significant conflict between character partitions when these differ only in the amount of undirected homoplasy (noise). This has been shown to be due to nonlinearity in the relationship between tree length and noise. Here we show that by standardizing either tree length or 1-retention index on a 0-to-1 scale, and then taking the arcsine of the value, the resulting value is linearly related to noise except at extremely high noise levels. We then investigate the effect of substituting these values for raw tree metrics in a modified ILD test (here called arcsine-ILD) for two types of noise. We show that, using the modified metric instead of the raw length, the results of ILD tests agreed better with desirable properties.
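
The transformation described in the abstract above can be sketched in a few lines. This is a minimal illustration under stated assumptions: tree length is standardized by min-max scaling between the minimum and maximum possible lengths for the data (which gives the same value as 1 minus the retention index), and the arcsine is applied directly to that value, as the abstract words it. The function names and the way the result would be plugged into an ILD test are hypothetical.

```python
import math


def standardized_length(length, min_len, max_len):
    """Scale a tree length to [0, 1]; equals 1 - retention index."""
    if max_len <= min_len:
        raise ValueError("max_len must exceed min_len")
    return (length - min_len) / (max_len - min_len)


def arcsine_metric(length, min_len, max_len):
    """Arcsine of the standardized length, in radians (0 to pi/2)."""
    x = standardized_length(length, min_len, max_len)
    if not 0.0 <= x <= 1.0:
        raise ValueError("length must lie between min_len and max_len")
    return math.asin(x)
```

A tree at the minimum possible length maps to 0 and one at the maximum possible length maps to pi/2; the modified test would then compare these transformed values rather than raw lengths.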

3.
Molecular phylogeny of the species Escherichia coli using the E. coli reference (ECOR) collection strains has been hampered by (1) the absence of rooting in the commonly used phenogram obtained from multilocus enzyme electrophoresis (MLEE) data and (2) the existence of recombination events between strains that scramble phylogenetic trees reconstructed from the nucleotide sequences of genes. We attempted to determine the phylogeny for E. coli based on the ECOR strain data by extracting from GenBank the nucleotide sequences of 11 chromosomal structural and 2 plasmid genes for which the Salmonella enterica homologous gene sequences were available. For each of the 13 DNA data sets studied, incongruence with a nonnucleotide whole-genome data set including MLEE, random amplified polymorphic DNA, and rrn restriction fragment length polymorphism data was measured using the incongruence length difference (ILD) test of Farris et al. As previously reported, the incongruence observed between the gnd and plasmid gene data and the whole-genome data was multiple, indicating numerous horizontal transfer and/or recombination events. In five cases, the incongruence detected by the ILD test was punctual, and the donor group was identified. Congruence was not rejected for the remaining data sets. The strains responsible for incongruences with the whole-genome data set were removed, leading to a "prior-agreement" approach, i.e., the determination of a phylogeny for E. coli based on several genes, excluding (1) the genes with multiple incongruences with the whole-genome data, (2) the strains responsible for punctual incongruences, and (3) the genes incongruent with each other. The obtained phylogeny shows that the most basal group of E. coli strains is the B2 group rather than the A group, as generally thought. The D group then emerges as the sister group of the rest. Finally, the A and B1 groups are sister groups. Interestingly, the most primitive taxon within E. coli in terms of branching pattern, i.e., the B2 group, includes highly virulent extraintestinal strains with derived characters (extraintestinal virulence determinants) occurring on its own branch.

4.
SUMMARY: Pairwise comparisons of disagreement in phylogenetic datasets offer a powerful tool for isolating historical incongruence for closer analysis. Statistically significant phylogenetic character incongruence may reflect important differences in evolutionary history, such as horizontal gene transfer. Such testing can also be used to specify possible combinations of datasets for further phylogenetic analysis. The process of comparing multiple datasets can be very time-consuming, and it is sometimes unclear how to combine data partitions given the observed patterns of incongruence. Here we present an application that automates the process of making pairwise comparisons between large numbers of phylogenetic datasets using the Incongruence Length Difference (ILD) test. The application also implements strategies for data combination based on the patterns of incongruence observed in pairwise comparisons.

5.
Comprehensive phylogenetic analyses utilize data from distinct sources, including nuclear, mitochondrial, and plastid molecular sequences and morphology. Such heterogeneous datasets are likely to require distinct models of analysis, given the different histories of mutational biases operating on these characters. The incongruence length difference (ILD) test is increasingly being used to arbitrate between competing models of phylogenetic analysis in cases where multiple data partitions have been collected. Our work suggests that the ILD test is unlikely to be an effective measure of congruence when two datasets differ markedly in size. We show that models that increase the contribution of one data partition over another are likely to increase congruence, as measured by this test. More alarmingly, for many bipartition comparisons, character congruence increases bimodally (either increasing or decreasing the contribution of one data partition will increase congruence), making it impossible to arrive at a single optimally congruent model of analysis.

6.
Particular serovars of Salmonella enterica have emerged as significant foodborne pathogens in humans. At the chromosomal level, discrete regions in the Salmonella genome have been identified that are known to play important roles in the maintenance, survival, and virulence of S. enterica within the host. Interestingly, several of these loci appear to have been acquired by horizontal transfer of DNA among and between bacterial species. The profound importance of recombination in pathogen emergence is just now being realized, perhaps explaining the sudden interest in developing novel and facile ways for detecting putative horizontal transfer events in bacteria. The incongruence length difference (ILD) test offers one such means. ILD uses phylogeny to trace sequences that may have been acquired promiscuously by exchange of DNA during chromosome evolution. We show here that the ILD test readily detects recombinations that have taken place in several housekeeping genes in Salmonella as well as genes composing the type 1 pilin complex (14 min) and the inv-spa invasion gene complex (63 min). Moreover, the ILD test indicated that the mutS gene (64 min), whose product helps protect the bacterial genome from invasion by foreign DNA, appears to have undergone intragenic recombination within S. enterica subspecies I. ILD findings were supported using additional tests known to be independent of the ILD approach (e.g., split decomposition analysis and compatibility of sites). Taken together, these data affirm the application of the ILD test as one approach for identifying recombined sequences in the Salmonella chromosome. Furthermore, horizontally acquired sequences within mutS support a model whereby evolutionarily important recombinants of S. enterica are rescued from strains carrying defective mutS alleles via horizontal transfer.

7.
The incongruence length difference (ILD) test may produce artificially large significance values with the addition not only of uninformative characters, but also of informative characters not relevant to the groups in conflict. Previously reported problems with the ILD test involved cases of false positives, reporting high incongruence when none is expected. Under certain conditions, the test can suffer from the opposite problem (false negatives), reporting non-significant values in cases of high incongruence. These opposing effects can be combined in a data set, such that a comparison over all partitions appears congruent, while some of the pairwise comparisons are reported as significantly incongruent. © The Willi Hennig Society 2006.

8.
Tests for incongruence as an indicator of among-data partition conflict have played an important role in conditional data combination. When such tests reveal significant incongruence, this has been interpreted as a rationale for not combining data into a single phylogenetic analysis. In this study of lorisiform phylogeny, we use the incongruence length difference (ILD) test to assess conflict among three independent data sets. A large morphological data set and two unlinked molecular data sets, the mitochondrial cytochrome b gene and the nuclear interphotoreceptor retinoid binding protein (exon 1), are analyzed with various optimality criteria and weighting mechanisms to determine the phylogenetic relationships among slow lorises (Primates, Loridae). When analyzed separately, the morphological data show impressive statistical support for a monophyletic Loridae. Both molecular data sets resolve the Loridae as paraphyletic, though with different branching orders depending on the optimality criterion or character weighting used. When the three data partitions are analyzed in various combinations, an inverse relationship between congruence and phylogenetic accuracy is observed. Nearly all combined analyses that recover monophyly indicate strong data partition incongruence (P = 0.00005 in the most extreme case), whereas all analyses that recover paraphyly indicate lack of significant incongruence. Numerous lines of evidence verify that monophyly is the accurate phylogenetic result. Therefore, this study contributes to a growing body of information affirming that measures of incongruence should not be used as indicators of data set combinability.

9.
This paper examines the efficiency of the incongruence length difference (ILD) test proposed by Farris et al. (1994) for assessing the incongruence between sets of characters. DNA sequences were simulated under various evolutionary conditions: (1) following symmetric or asymmetric trees, (2) with various mutation rates, (3) with constant or variable evolutionary rates along the branches, and (4) with different among-site substitution rates. We first compared two sets of sequences generated along the same tree and under the same evolutionary conditions. The probability of a Type I error (wrongly rejecting the true hypothesis of congruence) was substantially below the standard 5% level of significance given by the ILD test; this finding indicates that the choice of the 5% level is rather conservative in this case. We then compared two data sets, still generated along the same tree, but under different evolutionary conditions (constant vs. variable evolutionary rate, homogeneous vs. heterogeneous substitution rates). Under these conditions, the probability of rejecting the true hypothesis of congruence was greater than the 5% given by the ILD test and increased with the number of sites and the degree to which the tree was asymmetric. Finally, the comparison of the two data sets, simulated under contrasting tree structures (symmetric vs. asymmetric) but under the same evolutionary conditions, led us to reject the hypothesis of congruence, albeit weakly, particularly when the number of informative sites was low and the among-site substitution rate was heterogeneous. We conclude that the ILD test has only limited power to detect incongruence caused by differences in the evolutionary conditions or in the tree topology, except when numerous characters are present and the substitution rate is homogeneous from site to site.

10.
The Channichthyidae is a lineage of 16 species in the Notothenioidei, a clade of fishes that dominate Antarctic near-shore marine ecosystems with respect to both diversity and biomass. Among four published studies investigating channichthyid phylogeny, no two have produced the same tree topology, and no published study has investigated the degree of phylogenetic incongruence between existing molecular and morphological datasets. In this investigation we present an analysis of channichthyid phylogeny using complete gene sequences from two mitochondrial genes (ND2 and 16S) sampled from all recognized species in the clade. In addition, we have scored all 58 unique morphological characters used in three previous analyses of channichthyid phylogenetic relationships. Data partitions were analyzed separately to assess the amount of phylogenetic resolution provided by each dataset, and phylogenetic incongruence among data partitions was investigated using incongruence length difference (ILD) tests. We utilized a parsimony-based version of the Shimodaira-Hasegawa test to determine if alternative tree topologies are significantly different from trees resulting from maximum parsimony analysis of the combined partition dataset. Our results demonstrate that the greatest phylogenetic resolution is achieved when all molecular and morphological data partitions are combined into a single maximum parsimony analysis. Also, marginal to insignificant incongruence was detected among data partitions using the ILD. Maximum parsimony analysis of all data partitions combined results in a single tree, and is a unique hypothesis of phylogenetic relationships in the Channichthyidae. In particular, this hypothesis resolves the phylogenetic relationships of at least two species (Channichthys rhinoceratus and Chaenocephalus aceratus), for which there was no consensus among the previous phylogenetic hypotheses. The combined data partition dataset provides substantial statistical power to discriminate among alternative hypotheses of channichthyid relationships. These findings suggest the optimal strategy for investigating the phylogenetic relationships of channichthyids is one that uses all available phylogenetic data in analyses of combined data partitions.

11.
Measuring Topological Congruence by Extending Character Techniques   (total citations: 1; self-citations: 0; citations by others: 1)
A measure of topological congruence which is an extension of the Mickevich-Farris character incongruence metric (i.e., ILD; Mickevich and Farris, 1981) is proposed. Group inclusion characters (1 = member of a clade; 0 = not a member) are constructed for each topology to be considered. The sets of characters derived from the topologies are then compared for character incongruence due to data set combination. Each homoplasy signifies a disagreement among topological statements. The value is normalized for potential maximum incongruence to adjust values for unresolved topologies. This measure is compared to other topological and character congruence techniques and explored in test data.
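
The first step described above, building group inclusion characters from a topology, might look like the following sketch. It assumes trees are written as nested Python tuples of taxon names, and it skips single taxa and the clade containing every taxon because they carry no grouping information; both choices, and the function names, are assumptions for illustration and are not taken from the paper.

```python
def collect_clades(tree):
    """Return (leaf set, list of clade leaf sets) for a nested-tuple tree."""
    if not isinstance(tree, tuple):
        return frozenset([tree]), []
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_clades = collect_clades(child)
        leaves |= child_leaves
        found.extend(child_clades)
    found.append(leaves)            # the clade subtended by this node
    return leaves, found


def group_inclusion_characters(tree):
    """One 0/1 character per informative clade: 1 = member, 0 = not a member."""
    all_leaves, found = collect_clades(tree)
    taxa = sorted(all_leaves)
    informative = [c for c in found if 1 < len(c) < len(taxa)]
    return taxa, [[1 if t in c else 0 for t in taxa] for c in informative]
```

For the tree `(("A", "B"), ("C", ("D", "E")))` this gives one character each for the clades AB, DE, and CDE. The character sets from two topologies could then be combined and scored for homoplasy, as the abstract describes.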

12.
Analyses of DNA sequence datasets have repeatedly revealed inconsistencies in phylogenetic trees derived with different data. This is termed phylogenetic incongruence, and may arise from a methodological failure of the inference process or from biological processes, such as horizontal gene transfer, incomplete lineage sorting, and introgression. To better understand patterns of incongruence, we developed a method (PartFinder) that uses likelihood ratios applied to sliding windows for visualizing tree-support changes across genome-sequence alignments, allowing the comparative examination of complex phylogenetic scenarios among many species. As a pilot, we used PartFinder to investigate incongruence in the Homo-Pan-Gorilla group as well as Platyrrhini using high-quality bacterial artificial chromosome (BAC)-derived sequences as well as assembled whole-genome shotgun sequences. Our simulations verified the sensitivity of PartFinder, and our results were comparable to other studies of the Homo-Pan-Gorilla group. Analyses of the whole-genome alignments reveal significant associations between support for the accepted species relationship and specific characteristics of the genomic regions, such as GC-content, alignment score, exon content, and conservation. Finally, we analyzed sequence data generated for five platyrrhine species, and found incongruence that suggests a polytomy within Cebidae, in particular. Together, these studies demonstrate the utility of PartFinder for investigating the patterns of phylogenetic incongruence.

13.
To investigate the origins of incongruence among mammalian mitochondrial protein-coding genes, we compiled a matrix that included 13 protein-coding genes for 41 mammals from 14 different orders. This matrix was examined for congruence using different partitioning strategies. The incongruence length difference test showed significant incongruence among the 13 gene partitions used simultaneously, and the result was not affected by third codon or transversion weighting. In the pairwise comparisons, significant incongruence was detected between the NADH:ubiquinone oxidoreductase subunit 6 (ND6), cytochrome oxidase subunit II (COII), or cytochrome oxidase subunit III (COIII) gene partitioned individually against the rest of the genes. Omission of any of the 14 mammalian orders alone or in combinations from the matrix did not result in a statistically significant improvement of congruence, suggesting that taxonomic sampling will not improve congruence among the data sets. However, omission of ND6, COII, and COIII significantly improved congruence in our data matrix. Possible origins of unusual phylogenetic properties of the three genes are discussed.

14.
Data incongruence and the problem of avian louse phylogeny   (total citations: 2; self-citations: 0; citations by others: 2)
Smith, V. S., Page, R. D. M. & Johnson, K. P. (2004). Data incongruence and the problem of avian louse phylogeny. Zoologica Scripta, 33, 239–259.
Recent studies based on different types of data (i.e. morphological and molecular) have supported conflicting phylogenies for the genera of avian feather lice (Ischnocera: Phthiraptera). We analyse new and published data from morphology and from mitochondrial (12S rRNA and COI) and nuclear (EF1-α) genes to explore the sources of this incongruence and explain these conflicts. Character convergence, multiple substitutions at high divergences, and ancient radiation over a short period of time have contributed to the problem of resolving louse phylogeny with the data currently available. We show that apparent incongruence between the molecular datasets is largely attributable to rate variation and nonstationarity of base composition. In contrast, highly significant character incongruence leads to topological incongruence between the molecular and morphological data. We consider ways in which biases in the sequence data could be misleading, using several maximum likelihood models and LogDet corrections. The hierarchical structure of the data is explored using likelihood mapping and SplitsTree methods. Ultimately, we concede there is strong discordance between the molecular and morphological data and apply the conditional combination approach in this case. We conclude that higher level phylogenetic relationships within avian Ischnocera remain extremely problematic. However, consensus between datasets is beginning to converge on a stable phylogeny for avian lice, at and below the familial rank.

15.
Can three incongruence tests predict when data should be combined?   (total citations: 31; self-citations: 14; citations by others: 17)
Advocates of conditional combination have argued that testing for incongruence between data partitions is an important step in data exploration. Unless the partitions have had distinct histories, as in horizontal gene transfer, incongruence means that one or more data partitions support the wrong phylogeny. This study examines the relationship between incongruence and phylogenetic accuracy using three tests of incongruence. These tests were applied to pairs of mitochondrial DNA data partitions from two well-corroborated vertebrate phylogenies. Of the three tests, the most useful was the incongruence length difference test (ILD, also called the partition homogeneity test). This test distinguished between cases in which combining the data generally improved phylogenetic accuracy (P > 0.01) and cases in which accuracy of the combined data suffered relative to the individual partitions (P < 0.001). In contrast, in several cases, the Templeton and Rodrigo tests detected highly significant incongruence (P < 0.001) even though combining the incongruent partitions actually increased phylogenetic accuracy. All three tests identified cases in which improving the reconstruction model would improve the phylogenetic accuracy of the individual partitions.

16.
We examined three parallel data sets with respect to qualities relevant to phylogenetic analysis of 20 exemplar monocotyledons and related dicotyledons. The three data sets represent restriction-site variation in the inverted repeat region of the chloroplast genome, and nucleotide sequence variation in the chloroplast-encoded gene rbcL and in the mitochondrion-encoded gene atpA, the latter of which encodes the alpha-subunit of mitochondrial ATP synthase. The plant mitochondrial genome has been little used in plant systematics, in part because nucleotide sequence evolution in enzyme-encoding genes of this genome is relatively slow. The three data sets were examined in separate and combined analyses, with a focus on patterns of congruence, homoplasy, and data decisiveness. Data decisiveness (described by P. Goloboff) is a measure of robustness of support for most parsimonious trees by a data set in terms of the degree to which those trees are shorter than the average length of all possible trees. Because indecisive data sets require relatively fewer additional steps than decisive ones to be optimized on nonparsimonious trees, they will have a lesser tendency to be incongruent with other data sets. One consequence of this relationship between decisiveness and character incongruence is that if incongruence is used as a criterion of noncombinability, decisive data sets, which provide robust support for relationships, are more likely to be assessed as noncombinable with other data sets than are indecisive data sets, which provide weak support for relationships. For the sampling of taxa in this study, the atpA data set has about half as many cladistically informative nucleotides as the rbcL data set per site examined, and is less homoplastic and more decisive. The rbcL data set, which is the least decisive of the three, exhibits the lowest levels of character incongruence. Whatever the molecular evolutionary cause of this phenomenon, it seems likely that the poorer performance of rbcL than atpA, in terms of data decisiveness, is due to both its higher overall level of homoplasy and the fact that it is performing especially poorly at nonsynonymous sites.

17.
18.
Total evidence requires exclusion of phylogenetically misleading data   (total citations: 8; self-citations: 1; citations by others: 7)
Treating all available characters simultaneously in a single data matrix (i.e. combined or simultaneous analysis) is frequently called the 'total evidence' (TE) approach, following Kluge's introduction of the term in 1989, quoting Carnap (1950). However, the general principle and one of the possible procedures involved in its application are often confused. The principle, first enunciated within the context of inductive logic by Carnap in 1950, did not refer to a particular procedure, and TE meant using all relevant knowledge, rather than a combined analysis of all available data. Using TE, all relevant knowledge should be taken into account, including the fact that some data are probably misleading as indicators of species phylogeny and should be discarded. Based on the assumption that molecular partitions have some biological significance (process partitions obtained from nonrandom homoplasy or from 'processes of discord'), we suggest that separate analyses constitute an important exploratory investigation, while the phylogenetic tree itself should be produced by a final combined analysis of all relevant data. Given that the concept of process partitions is justified and that reliability cannot be evaluated using any robustness measure from a single combined analysis, the analysis of multiple data sets involves five steps: (1) perform separate analyses without consensus trees in order to assess reliability of clades through their recurrence and improve the detection of artifacts; (2) test significance of character incongruence, using, for example, pairwise ILD tests in order to identify the sets responsible for incongruence; (3) replace likely misleading data with question marks in the combined data matrix; (4) perform simultaneous analysis of this matrix without the misleading data; (5) assess the reliability of clades found by the combined analysis by computing their recurrence within the previous separate analyses, giving priority to repeatability.

19.
Previous molecular phylogenies of European cyprinids led to some solid facts and some uncertainties. This study is based on a stretch of more than 1 kb in the mitochondrial control region newly sequenced for 35 European cyprinids and on previous cytochrome b and 16S rDNA data. The trees based on the control region are more accurate and robust than those obtained from the two other genes. Character incongruence among the three genes was tested using the incongruence length difference (ILD) test. Iterative removals of individual sequences followed by new ILD tests identified two sequences responsible for statistically significant incongruence. A partial combination was conducted, that is, a combination of the three data sets, removing the two sequences previously identified. The phylogenetic analysis of this partial combination gives a more robust and resolved picture of subfamilial interrelationships. The Rasborinae are the sister group of all other cyprinids. The monophyletic Cyprininae emerges next. Tinca tinca first and then Rhodeus are the sister groups of all the remaining nonrasborine and noncyprinine species. Gobio is the sister group of the Leuciscinae, in which the Phoxinini are the sister group of the Leuciscini. Within the Leuciscini, the genus Leuciscus and the subfamily Alburninae are both paraphyletic. The Rasborinae are the most basal cyprinid subfamily and the Tincinae are not the sister group of the Cyprininae. These two results challenge only two anatomical characters, which need to be reinterpreted or considered as homoplastic in cyprinid evolution: the modification of the first pleural rib and its parapophysis and the bony composition of the interorbital septum.

20.
The human papillomaviruses (HPVs) have long been thought to follow a monophyletic pattern of evolution with little if any evidence for recombination between genomes. On the basis of this model, both oncogenicity and tissue tropism appear to have evolved once. Still, no systematic statistical analyses have shown whether monophyly is the rule across all HPV open reading frames (ORFs). We conducted a taxonomic analysis of 59 mucosal/genital HPVs using whole-genome and sliding-window similarity measures; maximum-parsimony, neighbor-joining, and Bayesian phylogenetic analyses; and localized incongruence length difference (LILD) analyses. The algorithm for the LILD analyses localized incongruence by calculating the tree length differences between constrained and unconstrained nodes in a total-evidence tree across all HPV ORFs. The process allows statistical evaluation of every ORF/node pair in the total-evidence tree. The most significant incongruence was observed at the putative high-risk (i.e., cancer-associated) node, the common oncogenic ancestor for alpha HPV species 9 (e.g., HPV type 16 [HPV16]), 11, 7 (e.g., HPV18), 5, and 6. Although these groups share early-gene homology, including high degrees of similarity among E6 and E7, groups 9 and 11 diverge from groups 7, 5, and 6 with respect to L2 and L1. The HPV species groups primarily associated with cervical and anogenital cancers appear to follow two distinct evolutionary paths, one conferred by the early genes and another by the late genes. The incongruence in the genital HPV phylogeny could have occurred from an early recombination event, an ecological niche change, and/or asymmetric genome convergence driven by intense selection. These data indicate that the phylogeny of the oncogenic HPVs is complex and that their evolution may not be monophyletic across all genes.

Copyright © Beijing Qinyun Technology Development Co., Ltd.  ICP licence no. 京ICP备09084417号