Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Measuring Topological Congruence by Extending Character Techniques   Total citations: 1, self-citations: 0, citations by others: 1
A measure of topological congruence which is an extension of the Mickevich–Farris character incongruence metric (i.e., ILD; Mickevich and Farris, 1981) is proposed. Group inclusion characters (1 = member of a clade; 0 = not a member) are constructed for each topology to be considered. The sets of characters derived from the topologies are then compared for character incongruence due to data set combination. Each homoplasy signifies a disagreement among topological statements. The value is normalized for potential maximum incongruence to adjust values for unresolved topologies. This measure is compared to other topological and character congruence techniques and explored in test data.
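
The group inclusion coding described in this abstract can be sketched in a few lines of Python. This is only an illustration: the function name, the taxon labels, and the example clades are hypothetical, and the abstract does not say how the authors implemented the coding.

```python
def group_inclusion_characters(taxa, clades):
    """Build one binary character per clade for every taxon.

    A taxon scores 1 if it is a member of the clade and 0 otherwise,
    following the coding stated in the abstract.
    """
    return {
        taxon: [1 if taxon in clade else 0 for clade in clades]
        for taxon in taxa
    }


# Hypothetical example: the topology ((A,B),(C,D)) contributes
# the clades {A, B} and {C, D}.
taxa = ["A", "B", "C", "D"]
clades = [{"A", "B"}, {"C", "D"}]
matrix = group_inclusion_characters(taxa, clades)
```

Character sets built this way for each topology could then be combined and scored for incongruence as the abstract describes.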

2.
Incongruence between data sets is an important concept in molecular phylogenetics and is commonly measured by the incongruence length difference (ILD) test (J. S. Farris et al., Cladistics 10, 315-319). The ILD test has been used to infer specific evolutionary events and to determine whether to combine data sets for phylogenetic analysis. However, the interpretation in the literature of the test's results varies because authors have conflicting expectations of the effect that noise will have. Using simulations we demonstrate that noise can by itself generate highly significant results in the ILD test and demonstrate why this is the case. To clarify the interpretation of test results, we suggest an additional procedure in which the result is compared against a frequency distribution generated from completely shuffled data. As examples, we apply this approach to two previous studies that have reported incongruence.
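
For readers unfamiliar with the statistic, the sketch below shows the arithmetic usually associated with the ILD test: the length of the shortest tree for the combined data minus the summed lengths of the shortest trees for each partition, compared against values from randomized partitions. The function names and all numbers are hypothetical, the tree lengths are assumed to come from a separate parsimony search, and the p-value convention shown (counting the observed value among the replicates) is a common one rather than something stated in the abstract.

```python
def ild(length_combined, partition_lengths):
    """ILD = combined tree length minus the sum of per-partition tree lengths."""
    return length_combined - sum(partition_lengths)


def permutation_p_value(observed, null_values):
    """Share of values at least as large as the observed one.

    The observed value is counted as one of the replicates, so the
    result can never be exactly zero.
    """
    at_least_as_large = sum(1 for value in null_values if value >= observed)
    return (at_least_as_large + 1) / (len(null_values) + 1)


# Hypothetical tree lengths from a parsimony search (not from the source).
observed_ild = ild(120, [50, 60])

# Hypothetical ILD values from nine random repartitions of the characters.
null_ilds = [2, 3, 10, 1, 4, 12, 0, 5, 3]
p_value = permutation_p_value(observed_ild, null_ilds)
```

The same comparison could be run against values from completely shuffled data, which is the additional procedure the abstract suggests.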

3.
This paper examines the efficiency of the incongruence length difference test (ILD) proposed by Farris et al. (1994) for assessing the incongruence between sets of characters. DNA sequences were simulated under various evolutionary conditions: (1) following symmetric or asymmetric trees, (2) with various mutation rates, (3) with constant or variable evolutionary rates along the branches, and (4) with different among-site substitution rates. We first compared two sets of sequences generated along the same tree and under the same evolutionary conditions. The probability of a Type-I error (wrongly rejecting the true hypothesis of congruence) was substantially below the standard 5% level of significance given by the ILD test; this finding indicates that the choice of the 5% level is rather conservative in this case. We then compared two data sets, still generated along the same tree, but under different evolutionary conditions (constant vs. variable evolutionary rate, homogeneity vs. heterogeneity rate of substitution). Under these conditions, the probability of rejecting the true hypothesis of congruence was greater than the 5% given by the ILD test and increased with the number of sites and the degree to which the tree was asymmetric. Finally, the comparison of the two data sets, simulated under contrasting tree structures (symmetric vs. asymmetric) but under the same evolutionary conditions, led us to reject the hypothesis of congruence, albeit weakly, particularly when the number of informative sites was low and among-site substitution rate heterogeneous. We conclude that the ILD test has only limited power to detect incongruence caused by differences in the evolutionary conditions or in the tree topology, except when numerous characters are present and the substitution rate is homogeneous from site to site.

4.
Molecular phylogeny of the species Escherichia coli using the E. coli reference (ECOR) collection strains has been hampered by (1) the absence of rooting in the commonly used phenogram obtained from multilocus enzyme electrophoresis (MLEE) data and (2) the existence of recombination events between strains that scramble phylogenetic trees reconstructed from the nucleotide sequences of genes. We attempted to determine the phylogeny for E. coli based on the ECOR strain data by extracting from GenBank the nucleotide sequences of 11 chromosomal structural and 2 plasmid genes for which the Salmonella enterica homologous gene sequences were available. For each of the 13 DNA data sets studied, incongruence with a nonnucleotide whole-genome data set including MLEE, random amplified polymorphic DNA, and rrn restriction fragment length polymorphism data was measured using the incongruence length difference (ILD) test of Farris et al. As previously reported, the incongruence observed between the gnd and plasmid gene data and the whole-genome data was multiple, indicating numerous horizontal transfer and/or recombination events. In five cases, the incongruence detected by the ILD test was punctual, and the donor group was identified. Congruence was not rejected for the remaining data sets. The strains responsible for incongruences with the whole-genome data set were removed, leading to a "prior-agreement" approach, i.e., the determination of a phylogeny for E. coli based on several genes, excluding (1) the genes with multiple incongruences with the whole genome data, (2) the strains responsible for punctual incongruences, and (3) the genes incongruent with each other. The obtained phylogeny shows that the most basal group of E. coli strains is the B2 group rather than the A group, as generally thought. The D group then emerges as the sister group of the rest. Finally, the A and B1 groups are sister groups. Interestingly, the most primitive taxon within E. coli in terms of branching pattern, i.e., the B2 group, includes highly virulent extraintestinal strains with derived characters (extraintestinal virulence determinants) occurring on its own branch.

5.
Comprehensive phylogenetic analyses utilize data from distinct sources, including nuclear, mitochondrial, and plastid molecular sequences and morphology. Such heterogeneous datasets are likely to require distinct models of analysis, given the different histories of mutational biases operating on these characters. The incongruence length difference (ILD) test is increasingly being used to arbitrate between competing models of phylogenetic analysis in cases where multiple data partitions have been collected. Our work suggests that the ILD test is unlikely to be an effective measure of congruence when two datasets differ markedly in size. We show that models that increase the contribution of one data partition over another are likely to increase congruence, as measured by this test. More alarmingly, for many bipartition comparisons, character congruence increases bimodally - either increasing or decreasing the contribution of one data partition will increase congruence - making it impossible to arrive at a single optimally congruent model of analysis.

6.
The incongruence length difference (ILD) test is prone to suggesting significant conflict between character partitions when these differ only in the amount of undirected homoplasy (noise). This has been shown to be due to nonlinearity in the relationship between tree length and noise. Here we show that by standardizing either tree length or 1-retention index on a 0-to-1 scale, and then taking the arcsine of the value, the resulting value is linearly related to noise except at extremely high noise levels. We then investigate the effect of substituting these values instead of raw tree metrics in a modified ILD test (here called arcsine-ILD) for two types of noise. We show that, using the modified metric instead of the raw length, the results of ILD tests agreed better with desirable properties.
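
The transformation described in this abstract can be illustrated with a short Python sketch. It follows the abstract literally (standardize to a 0-to-1 scale, then take the arcsine); the function names and the example lengths are hypothetical, and the exact standardization used in the paper is an assumption here. When the minimum and maximum are the smallest and largest possible lengths of the data on any tree, the standardized length equals 1 minus the retention index.

```python
import math


def standardized_length(length, min_length, max_length):
    """Rescale a tree length to the 0-to-1 range (assumed min/max standardization)."""
    if max_length <= min_length:
        raise ValueError("max_length must exceed min_length")
    return (length - min_length) / (max_length - min_length)


def arcsine_transform(value):
    """Arcsine of a value already standardized to the 0-to-1 range."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must lie between 0 and 1")
    return math.asin(value)


# Hypothetical lengths: observed 150 steps, minimum 100, maximum 200.
scaled = standardized_length(150, 100, 200)
transformed = arcsine_transform(scaled)
```

In an arcsine-ILD style comparison, values such as `transformed` would take the place of raw tree lengths.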

7.
The problem of testing for congruence between phylogenetic data has long been debated among phylogeneticists, but reaches a critical point with the availability of large amounts of biological sequences. Notably in prokaryotes, where the amount of lateral transfers is believed to be important, the inference of phylogenies using multiple genes requires testing for incongruence before concatenating the genes. On another scale, incongruence tests can be used to detect recombination points within single gene alignments. The incongruence length difference test (ILD), based on parsimony, has been proved to be useful for finding incongruent data sets, but its application remains limited to small data sets for computational time reasons. Here, we have adapted the principle of ILD to the BIONJ algorithm. This algorithm is based on a tree length minimisation criterion and is suitable to replace parsimony in this test when used with uncorrected distance (model-free approach). We show that this new test, ILD-BIONJ, while being much faster, is often more accurate than the ILD test, especially when the alignments compared are simulated under different evolutionary models.

8.
Can three incongruence tests predict when data should be combined?   Total citations: 31, self-citations: 14, citations by others: 17
Advocates of conditional combination have argued that testing for incongruence between data partitions is an important step in data exploration. Unless the partitions have had distinct histories, as in horizontal gene transfer, incongruence means that one or more data support the wrong phylogeny. This study examines the relationship between incongruence and phylogenetic accuracy using three tests of incongruence. These tests were applied to pairs of mitochondrial DNA data partitions from two well-corroborated vertebrate phylogenies. Of the three tests, the most useful was the incongruence length difference test (ILD, also called the partition homogeneity test). This test distinguished between cases in which combining the data generally improved phylogenetic accuracy (P > 0.01) and cases in which accuracy of the combined data suffered relative to the individual partitions (P < 0.001). In contrast, in several cases, the Templeton and Rodrigo tests detected highly significant incongruence (P < 0.001) even though combining the incongruent partitions actually increased phylogenetic accuracy. All three tests identified cases in which improving the reconstruction model would improve the phylogenetic accuracy of the individual partitions.

9.
SUMMARY: Pairwise comparisons of disagreement in phylogenetic datasets offer a powerful tool for isolating historical incongruence for closer analysis. Statistically significant phylogenetic character incongruence may reflect important differences in evolutionary history, such as horizontal gene transfer. Such testing can also be used to specify possible combinations of datasets for further phylogenetic analysis. The process of comparing multiple datasets can be very time consuming, and it is sometimes unclear how to combine data partitions given the observed patterns of incongruence. Here we present an application that automates the process of making pairwise comparisons between large numbers of phylogenetic datasets using the Incongruence Length Difference (ILD) test. The application also implements strategies for data combination based on the patterns of incongruence observed in pairwise comparisons.

10.
The robustness of clades to parameter variation may be a desirable quality or even a goal in phylogenetic analyses. Sensitivity analyses used to assess clade stability have invoked the incongruence length difference (ILD or WILD) metric, a measure of congruence among datasets, to compare a series of most‐parsimonious results from re‐running analyses under different analytical conditions. It is also common practice to select a single “optimal” parameter set that minimizes WILD across all parameter sets. However, the divergent molecular evolution of ribosomal genes and protein‐encoding genes (specifically the bias against transversion events in coding genes of conserved function) suggests that deployment of multiple parameter sets could outperform the use of a single parameter set applied to all molecules. We explored congruence in five published datasets by including mixed parameter sets in our sensitivity analysis. In four cases, mixed parameter sets outperformed the previously reported, single optimal parameter set. Conversely, multiple parameter sets did not outperform a single optimal parameter set in a case in which actual strong topological conflict exists between data partitions. Exploration of mixed parameter sets may prove useful when combining ribosomal and protein‐encoding genes, due to the relatively higher frequency of single‐ and double‐base pair indel events in the former, and the relatively lower frequency of transversions in the latter.
© The Willi Hennig Society 2010.

11.
Previous molecular phylogenies of European cyprinids led to some solid facts and some uncertainties. This study is based on a stretch of more than 1 kb in the mitochondrial control region newly sequenced for 35 European cyprinids and on previous cytochrome b and 16S rDNA data. The trees based on the control region are more accurate and robust than those obtained from the two other genes. Character incongruence among the three genes was tested using the incongruence length difference (ILD) test. Iterative removals of individual sequences followed by new ILD tests identified two sequences responsible for statistically significant incongruence. A partial combination was conducted, that is, a combination of the three data sets, removing the two sequences previously identified. The phylogenetic analysis of this partial combination gives a more robust and resolved picture of subfamilial interrelationships. The Rasborinae are the sister group of all other cyprinids. The monophyletic Cyprininae emerges next. Tinca tinca first and then Rhodeus are the sister groups of all the remaining nonrasborine and noncyprinine species. Gobio is the sister group of the Leuciscinae, in which the Phoxinini are the sister group of the Leuciscini. Within the Leuciscini, the genus Leuciscus and the subfamily Alburninae are both paraphyletic. The Rasborinae are the most basal cyprinid subfamily and the Tincinae are not the sister group of the Cyprininae. These two results challenge only two anatomical characters, which need to be reinterpreted or considered as homoplastic in cyprinid evolution: the modification of the first pleural rib and its parapophysis and the bony composition of the interorbital septum.

12.
The Channichthyidae is a lineage of 16 species in the Notothenioidei, a clade of fishes that dominate Antarctic near-shore marine ecosystems with respect to both diversity and biomass. Among four published studies investigating channichthyid phylogeny, no two have produced the same tree topology, and no published study has investigated the degree of phylogenetic incongruence between existing molecular and morphological datasets. In this investigation we present an analysis of channichthyid phylogeny using complete gene sequences from two mitochondrial genes (ND2 and 16S) sampled from all recognized species in the clade. In addition, we have scored all 58 unique morphological characters used in three previous analyses of channichthyid phylogenetic relationships. Data partitions were analyzed separately to assess the amount of phylogenetic resolution provided by each dataset, and phylogenetic incongruence among data partitions was investigated using incongruence length difference (ILD) tests. We utilized a parsimony-based version of the Shimodaira-Hasegawa test to determine if alternative tree topologies are significantly different from trees resulting from maximum parsimony analysis of the combined partition dataset. Our results demonstrate that the greatest phylogenetic resolution is achieved when all molecular and morphological data partitions are combined into a single maximum parsimony analysis. Also, marginal to insignificant incongruence was detected among data partitions using the ILD. Maximum parsimony analysis of all data partitions combined results in a single tree, and is a unique hypothesis of phylogenetic relationships in the Channichthyidae. In particular, this hypothesis resolves the phylogenetic relationships of at least two species (Channichthys rhinoceratus and Chaenocephalus aceratus), for which there was no consensus among the previous phylogenetic hypotheses. The combined data partition dataset provides substantial statistical power to discriminate among alternative hypotheses of channichthyid relationships. These findings suggest the optimal strategy for investigating the phylogenetic relationships of channichthyids is one that uses all available phylogenetic data in analyses of combined data partitions.

13.
Introduction: Positive predictive value (PPV), measuring the percentage of moderate dyskaryosis or worse confirmed as CIN2 or worse, is used as a measure of accuracy in cervical screening. However, it relates more to specificity than sensitivity because the denominator includes false positives rather than false negatives. Low values reflect over‐reporting of high‐grade dyskaryosis but high values may reflect under‐reporting. Sensitivity is impossible to measure from correlation of cytology with outcome because women with negative cytology are rarely referred for colposcopy. Rates of CIN3 resulting from referrals for low‐grade cytology may be used as a surrogate for sensitivity, as high values may reflect under‐reporting (ref). Study design: Outcome of colposcopy referrals was monitored during a period of 4 years, using a fail‐safe database. Results: PPV at Guy's & St Thomas rose from 54% in 1998/1999 to 69% in 2001/2002. The former was below the NHSCSP recommended range. During the same period of time CIN1 rates for moderate dyskaryosis fell from 37% to 24%, reflecting the main source of discrepancy. While specificity increased (as reflected by increasing PPV) sensitivity remained constant in that CIN3 rates for mild dyskaryosis and borderline remained below 6%: average rates in England have fallen over the last 3 years and were 7.4% in 2000/2001 (ref). CIN2 rates for mild dyskaryosis also remained constant at 11% to 12%. Conclusion: Correlation of biopsy results with high‐ and low‐grade cytological abnormalities is a useful method of monitoring accuracy of cytology reporting, and can be used to measure over‐ and under‐reporting as surrogates for specificity and sensitivity.
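
The PPV defined at the start of this abstract is a simple proportion, shown here as a small Python sketch. The function name and the counts are hypothetical and chosen only to reproduce a value of 69%; they are not data from the study.

```python
def positive_predictive_value(confirmed_cin2_or_worse, referred_high_grade):
    """Percentage of high-grade cytology referrals confirmed as CIN2 or worse."""
    if referred_high_grade <= 0:
        raise ValueError("referred_high_grade must be positive")
    return 100.0 * confirmed_cin2_or_worse / referred_high_grade


# Hypothetical counts: 138 of 200 referrals for moderate dyskaryosis
# or worse were confirmed as CIN2 or worse on biopsy.
ppv = positive_predictive_value(138, 200)
```

Because the denominator contains only cytology-positive referrals, false negatives never enter the calculation, which is why the abstract ties PPV to specificity rather than sensitivity.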

14.
Phylogenetic reconstructions of bacterial species from DNA sequences are hampered by the existence of horizontal gene transfer. One possible way to overcome the confounding influence of such movement of genes is to identify and remove sequences which are responsible for significant character incongruence when compared to a reference dataset free of horizontal transfer (e.g., multilocus enzyme electrophoresis, restriction fragment length polymorphism, or random amplified polymorphic DNA) using the incongruence length difference (ILD) test of Farris et al. [Cladistics 10 (1995) 315]. As obtaining this "whole genome dataset" prior to the reconstruction of a phylogeny is clearly troublesome, we have tested alternative approaches that dispense with such a reference dataset, designed for a species with a modest level of horizontal gene transfer, i.e., Escherichia coli. Eleven different genes available or sequenced in this work were studied in a set of 30 E. coli reference (ECOR) strains. Either using ILD to test incongruence between each gene and all the remaining (in this case 10) genes in order to remove sequences responsible for significant incongruence, or using just a simultaneous analysis without removals, gave robust phylogenies with slight topological differences. The use of the ILD test remains a suitable method for estimating the level of horizontal gene transfer in bacterial species. Supertrees also had suitable properties to extract the phylogeny of strains, because the way they summarize taxonomic congruence clearly limits the impact of individual gene transfers on the global topology. Furthermore, this work allowed a significant improvement of the accuracy of the phylogeny within E. coli.

15.
Partition-free congruence analysis: implications for sensitivity analysis   Total citations: 1, self-citations: 0, citations by others: 1
A criterion is proposed to compare systematic hypotheses based on multiple sources of information under a diverse set of interpretive assumptions (i.e., sensitivity analysis of Wheeler, 1995). This metric, the Meta‐Retention Index (MRI), is the retention index (RI) of Farris calculated over the set of conventional homologous qualitative characters (ordered, unordered, Sankoff, etc.) and molecular fragment characters sensu Wheeler (1996, 1999). The superiority of this measure to other similar measures (e.g., incongruence length difference test) comes from its independence from partition information. The only values that participate in its calculation are the minimum, maximum and observed cost (= cladogram cost) of each character. The partition (morphology, gene locus) from which the variant may have come is irrelevant. In the special cases where there is only a single data partition, this measure is equivalent to the conventional RI; and in the case where there are single fragment characters per partition (contiguous molecular loci as data sets) the measure is identical to the complement of the Rescaled Incongruence Length Difference (RILD) of Wheeler and Hayashi (1998). The MRI can serve as an optimality criterion for deciding among systematic hypotheses based on the same data, but different sets of analysis assumptions (e.g., character weights, indel costs). The MRI may lose discriminatory power in situations where a minority of highly congruent characters is given high weight. This situation can be detected and seems unlikely to occur frequently in real data sets. © The Willi Hennig Society 2006.
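
The abstract states that the MRI uses only the minimum, maximum and observed cost of each character. A minimal Python sketch of that calculation follows, assuming the usual ensemble form of the retention index (costs summed over all characters before taking the ratio). The function name and the cost triples are hypothetical, and the summed form is an assumption rather than a formula quoted from the paper.

```python
def meta_retention_index(characters):
    """Ensemble retention index over (min_cost, max_cost, observed_cost) triples.

    The partition each character came from plays no part in the calculation.
    """
    total_min = sum(min_cost for min_cost, _, _ in characters)
    total_max = sum(max_cost for _, max_cost, _ in characters)
    total_obs = sum(observed for _, _, observed in characters)
    if total_max == total_min:
        raise ValueError("index is undefined when maximum and minimum costs are equal")
    return (total_max - total_obs) / (total_max - total_min)


# Hypothetical costs for three characters: (minimum, maximum, observed).
costs = [(1, 3, 2), (1, 4, 1), (2, 6, 4)]
mri = meta_retention_index(costs)
```

With these numbers the summed minimum is 4, the summed maximum is 13 and the summed observed cost is 7, so the index is 6/9.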

16.
Tests for incongruence as an indicator of among-data partition conflict have played an important role in conditional data combination. When such tests reveal significant incongruence, this has been interpreted as a rationale for not combining data into a single phylogenetic analysis. In this study of lorisiform phylogeny, we use the incongruence length difference (ILD) test to assess conflict among three independent data sets. A large morphological data set and two unlinked molecular data sets--the mitochondrial cytochrome b gene and the nuclear interphotoreceptor retinoid binding protein (exon 1)--are analyzed with various optimality criteria and weighting mechanisms to determine the phylogenetic relationships among slow lorises (Primates, Loridae). When analyzed separately, the morphological data show impressive statistical support for a monophyletic Loridae. Both molecular data sets resolve the Loridae as paraphyletic, though with different branching orders depending on the optimality criterion or character weighting used. When the three data partitions are analyzed in various combinations, an inverse relationship between congruence and phylogenetic accuracy is observed. Nearly all combined analyses that recover monophyly indicate strong data partition incongruence (P = 0.00005 in the most extreme case), whereas all analyses that recover paraphyly indicate lack of significant incongruence. Numerous lines of evidence verify that monophyly is the accurate phylogenetic result. Therefore, this study contributes to a growing body of information affirming that measures of incongruence should not be used as indicators of data set combinability.

17.
Particular serovars of Salmonella enterica have emerged as significant foodborne pathogens in humans. At the chromosomal level, discrete regions in the Salmonella genome have been identified that are known to play important roles in the maintenance, survival, and virulence of S. enterica within the host. Interestingly, several of these loci appear to have been acquired by horizontal transfer of DNA among and between bacterial species. The profound importance of recombination in pathogen emergence is just now being realized, perhaps explaining the sudden interest in developing novel and facile ways for detecting putative horizontal transfer events in bacteria. The incongruence length difference (ILD) test offers one such means. ILD uses phylogeny to trace sequences that may have been acquired promiscuously by exchange of DNA during chromosome evolution. We show here that the ILD test readily detects recombinations that have taken place in several housekeeping genes in Salmonella as well as genes composing the type 1 pilin complex (14 min) and the inv-spa invasion gene complex (63 min). Moreover, the ILD test indicated that the mutS gene (64 min), whose product helps protect the bacterial genome from invasion by foreign DNA, appears to have undergone intragenic recombination within S. enterica subspecies I. ILD findings were supported using additional tests known to be independent of the ILD approach (e.g., split decomposition analysis and compatibility of sites). Taken together, these data affirm the application of the ILD test as one approach for identifying recombined sequences in the Salmonella chromosome. Furthermore, horizontally acquired sequences within mutS support a model whereby evolutionarily important recombinants of S. enterica are rescued from strains carrying defective mutS alleles via horizontal transfer.

18.
The total support index (Bremer, 1994), a measure of cladogram support, is influenced both by character incongruence and by uninformative characters. A modified version of this index is therefore proposed: the proportional support index. This index measures the actual support for a cladogram relative to the maximum potential support as determined by the number of informative characters. This index is not distorted by uninformative characters and is thus a more accurate means to compare the strength of phylogenetic signals in different data sets.

19.
Abstract: Gulls (Aves: Larinae) are among the best-studied of birds, yet prior attempts to reconstruct gull relationships have met with little success. In the present study I use 117 characters from the skeleton and 64 from the integument to test gull monophyly and estimate gull phylogeny. One shortest tree, requiring 9747 unweighted changes and having a CI of 0.267, was found. Larus is polyphyletic. Although the tree is fully resolved, support for many of the inferred clades is poor. In a comparison of osteological and integumentary evidence, I found that incongruence between the osteological and integumentary character sets accounts for only a minority of the total incongruence observed, and suggest that low between-set incongruence may be a consequence of the low signal-to-noise ratio in each set of characters. I also found that osteological evidence is particularly important for determining higher-level structure, whereas integumentary evidence is important for resolving lower-level relationships within the gull group. Finally, I found that integumentary characters are not dramatically more homoplasious than osteological characters, and argue that casual dismissal of integumentary characters as “too labile” is unwarranted.

20.
A phylogeny of the meiofaunal polychaete family Nerillidae based on morphological, molecular and combined data is presented here. The data sets comprise nearly complete sequences of 18S rDNA and 40 morphological characters of 17 taxa. Sequences were analyzed simultaneously with the morphological data by direct optimization in the program POY, with a variety of parameter sets (costs of gaps: transversions: transitions). Three outgroups were selected from the major polychaete group Aciculata and one from Scolecida. The 13 nerillid species from 11 genera were monophyletic in all analyses with very high support, and three new apomorphies for Nerillidae are identified. The topology of the ingroup varied according to the various parameter settings. Reducing the number of outgroups to one decreased the variance among the phylogenetic hypotheses. The congruence among these was tested and a parameter set, with equal weights (222) and extension gap weighted 1, yielded minimum incongruence (ILD). Several terminal clades of the combined analysis were highly supported, as well as the position of Leptonerilla prospera as sister terminal to the other nerillids. The evolution of morphological characters such as segment numbers, chaetae, appendages and ciliation are traced and discussed. A regressive pathway within Nerillidae is indicated for several characters, however, generally implying several convergent losses. Numerous genera are shown to require revision. © The Willi Hennig Society 2005.
