Similar articles
1.
Partial DNA sequences of the mitochondrial cytochrome oxidase subunit I gene (408 bp) from 17 mite species, together with the amino acid sequences translated from them, were used to analyze the phylogenetic relationships within the terrestrial Parasitengona (Trombidia). Due to mutational saturation of the third codon position, only first and second codon positions and amino acid sequences were analyzed, applying neighbor-joining, maximum-parsimony, and maximum-likelihood tree-building methods. The reconstructed trees revealed similar topologies of taxa; however, the phylogenetic relationships could be convincingly resolved only within several trombidioid taxa. The proposed basic relationships within the Parasitengona, in particular those of Calyptostomatoidea, Smarididae, and Erythraeidae, were poorly supported in bootstrap tests. A comparison of the presented gene tree with a phylogenetic tree based upon traditional characters revealed only a few contradictions, in nodes only weakly supported by morphological data. The most astonishing result is the proposed early derivative position of Microtrombidiidae within the terrestrial Parasitengona.
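
Restricting an analysis to first and second codon positions, as described in the abstract above, amounts to slicing an in-frame alignment by position. The following is a minimal sketch of that step only; the function name and the two short sequences are hypothetical and are not data from the study.

```python
def codon_positions(seq, keep=(1, 2)):
    """Return the bases found at the given codon positions (1-based).

    Assumes `seq` is an in-frame coding sequence whose first base is
    codon position 1.
    """
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    keep_idx = {p - 1 for p in keep}
    return "".join(base for i, base in enumerate(seq) if i % 3 in keep_idx)


# Hypothetical in-frame fragments, used only for illustration.
alignment = {"taxon_A": "ATGGCTTTA", "taxon_B": "ATGGCCCTG"}

# First and second positions only (third positions dropped).
first_second = {name: codon_positions(s) for name, s in alignment.items()}

# Third positions only, e.g. for a separate saturation check.
third_only = {name: codon_positions(s, keep=(3,)) for name, s in alignment.items()}
```

The same helper can produce a third-position-only data set, which is the partition several of the abstracts below discuss.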

2.
Phylogenetic analyses of first and second codon positions (DNA1 + 2 analysis) and amino acid sequences (protein analysis) are often thought to provide similar estimates of deep-level phylogeny. However, here we report a novel artifact influencing DNA level phylogenetic inference of protein-coding genes introduced by codon usage heterogeneity that causes significant incongruities between DNA1 + 2 and protein analyses. DNA1 + 2 analyses of plastid-encoded psbA genes (encoding photosystem II D1 proteins) strongly suggest a relationship between haptophyte plastids and typical (peridinin-containing) dinoflagellate plastids. The psbA genes from haptophytes and a subset of the peridinin-type plastids display similar codon usage patterns for Leu, Ser, and Arg, which are each encoded by two separated codon sets that differ at first or first plus second codon positions. Our detailed analyses clearly indicate that these unusual preferences shared by haptophyte and some peridinin-type plastid genes are largely responsible for their strong affinity in DNA analyses. In particular, almost all of the support from DNA level analyses for the monophyly of haptophyte and peridinin-type plastids is lost when the codons corresponding to constant Leu, Ser, and Arg amino acids are excluded, suggesting that this signal comes from rapidly evolving synonymous substitutions, rather than from substitutions that result in amino acid changes. Indeed, protein maximum-likelihood analyses of concatenated PsaA and PsbA amino acid sequences indicate that, although 19' hexanoyloxyfucoxanthin-type (19' HNOF-type) plastids in dinoflagellates group with haptophyte plastids, peridinin-type plastids group weakly with those of stramenopiles. Consequently, our results cast doubt on the single origin of peridinin-type and 19' HNOF-type plastids in dinoflagellates previously suggested on the basis of psaA and psbA concatenated gene phylogenetic analyses. We suggest that codon usage heterogeneity could be a more general problem for DNA level analyses of protein-coding genes, even when third codon positions are excluded.
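
The codon usage pattern at issue can be made concrete by tallying which of the two codon sets each Leu, Ser, and Arg codon belongs to. The sketch below assumes the standard genetic code; the function name and the example sequence are hypothetical, and this is not the procedure used by the authors.

```python
from collections import Counter

# Codon sets for the three six-fold degenerate amino acids in the standard
# genetic code. The two sets for each amino acid differ at the first codon
# position (Leu, Arg) or at the first and second positions (Ser), so a
# synonymous switch between sets is visible even without third positions.
CODON_SETS = {
    "Leu": {"CTN": {"CTT", "CTC", "CTA", "CTG"}, "TTR": {"TTA", "TTG"}},
    "Ser": {"TCN": {"TCT", "TCC", "TCA", "TCG"}, "AGY": {"AGT", "AGC"}},
    "Arg": {"CGN": {"CGT", "CGC", "CGA", "CGG"}, "AGR": {"AGA", "AGG"}},
}


def codon_set_usage(cds):
    """Count how often each Leu/Ser/Arg codon set occurs in an in-frame CDS."""
    usage = Counter()
    cds = cds.upper().replace("U", "T")
    # Ignore any trailing partial codon.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        for amino_acid, sets in CODON_SETS.items():
            for set_name, codons in sets.items():
                if codon in codons:
                    usage[(amino_acid, set_name)] += 1
    return usage


# Hypothetical fragment: TTA TTG CTG AGC AGT CGT
usage = codon_set_usage("TTATTGCTGAGCAGTCGT")
```

Comparing such tallies across taxa gives a quick view of whether two lineages share a preference for the same codon set, which is the kind of similarity the abstract identifies as misleading.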

3.
A heuristic approach to search for the maximum-likelihood (ML) phylogenetic tree based on a genetic algorithm (GA) has been developed. It outputs the best tree as well as multiple alternative trees that are not significantly worse than the best one on the basis of the likelihood criterion. These near-optimum trees are subjected to further statistical tests. This approach enables one to infer phylogenetic trees of over 20 taxa, taking account of the rate heterogeneity among sites, on practical time scales on a PC cluster. Computer simulations were conducted to compare the efficiency of the present approach with that of several likelihood-based methods and distance-based methods, using amino acid sequence data of relatively large numbers (5–24) of taxa. The superiority of the ML method over distance-based methods increases as the condition of simulations becomes more realistic (an incorrect model is assumed or many taxa are involved). This approach was applied to the inference of the universal tree based on the concatenated amino acid sequences of vertically descended genes that are shared among all genomes whose complete sequences have been reported. The inferred tree strongly supports that Archaea is paraphyletic and Eukarya is specifically related to Crenarchaeota. Apart from the paraphyly of Archaea and some minor disagreements, the universal tree based on these genes is largely consistent with the universal tree based on SSU rRNA. Received: 4 January 2001 / Accepted: 16 May 2001

4.
5.
Character-state space versus rate of evolution in phylogenetic inference   Cited by: 1 (self-citations: 0, citations by others: 1)
With only four alternative character states, parallelisms and reversals are expected to occur frequently when using nucleotide characters for phylogenetic inference. Greater available character‐state space has been described as one of the advantages of third codon positions relative to first and second codon positions, as well as amino acids relative to nucleotides. We used simulations to quantify how character‐state space and rate of evolution relate to one another, and how this relationship is affected by differences in tree topology, branch lengths, rate heterogeneity among sites, probability of change among states, and frequency of character states. Specifically, we examined how inferred tree lengths, consistency and retention indices, and accuracy of phylogenetic inference are affected. Our results indicate that the relatively small increases in the character‐state space evident in empirical data matrices can provide enormous benefits for the accuracy of phylogenetic inference. This advantage may become more pronounced with unequal probabilities of change among states. Although increased character‐state space greatly improved the accuracy of topology inference, improvements in the estimation of the correct tree length were less apparent. Accuracy and inferred tree length improved most when character‐state space increased initially; further increases provided more modest improvements. © The Willi Hennig Society 2004.
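
The consistency and retention indices mentioned in the abstract above have standard ensemble definitions: CI = m / s and RI = (g - s) / (g - m), where m is the minimum possible number of steps, g the maximum possible number of steps on any tree, and s the number of steps observed on the tree, each summed over characters. The following sketch computes both; the per-character step counts are hypothetical and the function names are illustrative.

```python
def consistency_index(min_steps, observed_steps):
    """Ensemble consistency index: CI = sum(m) / sum(s)."""
    return sum(min_steps) / sum(observed_steps)


def retention_index(min_steps, max_steps, observed_steps):
    """Ensemble retention index: RI = (sum(g) - sum(s)) / (sum(g) - sum(m))."""
    m, g, s = sum(min_steps), sum(max_steps), sum(observed_steps)
    if g == m:
        raise ValueError("RI is undefined when max and min steps are equal")
    return (g - s) / (g - m)


# Hypothetical step counts for three characters on one tree.
min_steps = [1, 1, 2]       # m: fewest steps possible for each character
max_steps = [4, 3, 5]       # g: most steps possible on any tree
observed_steps = [2, 1, 3]  # s: steps required on the tree being scored

ci = consistency_index(min_steps, observed_steps)           # 4 / 6
ri = retention_index(min_steps, max_steps, observed_steps)  # (12 - 6) / (12 - 4)
```

Both indices equal 1 when no homoplasy is required, so lower values on a given tree indicate more parallelisms and reversals.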

6.
The phylogenetic position of the Pedetidae, represented by a single species Pedetes capensis, is controversial, reflecting in part the retention of both Hystricomorphous and Sciurognathous characteristics in this rodent. In an attempt to clarify the species' evolutionary relationships, mtDNA gene sequences from 10 rodent species (representing seven families) were analyzed using phenetic, parsimony, and maximum-likelihood methods of phylogenetic inference; the rabbit, Oryctolagus cuniculus (Order Lagomorpha), and cow, Bos taurus (Order Artiodactyla), were used as outgroups. Investigation of 714 base pairs of the protein-coding cytochrome b gene indicates strong base bias at the third codon position with significant rate heterogeneity evident between the three structural domains of this gene. Similar analyses conducted on 816 base pairs of the 12S rRNA gene revealed a transversion bias in the loop sections of all taxa. The cytochrome b gene sequences proved useful in resolving associations between closely related species but failed to produce consistent tree topologies at the family level. In contrast, phylogenetic analysis of the 12S rRNA gene resulted in strong support for the clustering of Pedetidae/Heteromyidae/Geomyidae and Muridae in one clade to the exclusion of the Hystricidae/Thryonomyidae and Sciuridae, a finding which is concordant with studies of rodent fetal membranes as well as reproductive and other anatomical features.

7.

Background  

Sequence data analyses such as gene identification, structure modeling or phylogenetic tree inference involve a variety of bioinformatics software tools. Due to the heterogeneity of bioinformatics tools in usage and data requirements, scientists spend much effort on technical issues including data format, storage and management of input and output, and memorization of numerous parameters and multi-step analysis procedures.

8.
A major assumption of many molecular phylogenetic methods is the homogeneity of nucleotide frequencies among taxa, which refers to the equality of the nucleotide frequency bias among species. Changes in nucleotide frequency among different lineages in a data set are thought to lead to erroneous phylogenetic inference because unrelated clades may appear similar because of evolutionarily unrelated similarities in nucleotide frequencies. We tested the effects of the heterogeneity of nucleotide frequency bias on phylogenetic inference, along with the interaction between this heterogeneity and stratified taxon sampling, by means of computer simulations using evolutionary parameters derived from genomic databases. We found that the phylogenetic trees inferred from data sets simulated under realistic, observed levels of heterogeneity for mammalian genes were reconstructed with accuracy comparable to those simulated with homogeneous nucleotide frequencies; the results hold for Neighbor-Joining, minimum evolution, maximum parsimony, and maximum-likelihood methods. The LogDet distance method, specifically designed to deal with heterogeneous nucleotide frequencies, does not perform better than distance methods that assume substitution pattern homogeneity among sequences. In these specific simulation conditions, we did not find a significant interaction between phylogenetic accuracy and substitution pattern heterogeneity among lineages, even when the taxon sampling is increased.
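
A common descriptive check of the homogeneity assumption described above is a chi-square statistic on a taxa-by-base contingency table. The sketch below computes only the statistic and its degrees of freedom; it is not the simulation design of the study, the function names and sequences are hypothetical, and the statistic ignores the non-independence of related sequences, so it is best read as a rough screen.

```python
def base_counts(seq):
    """Counts of A, C, G, T in a sequence (other symbols are ignored)."""
    seq = seq.upper()
    return [seq.count(base) for base in "ACGT"]


def composition_chi_square(sequences):
    """Chi-square statistic and degrees of freedom for a taxa x base table."""
    table = [base_counts(s) for s in sequences.values()]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected > 0:  # skip bases absent from every sequence
                chi2 += (observed - expected) ** 2 / expected

    bases_present = sum(1 for c in col_totals if c > 0)
    df = (len(table) - 1) * (bases_present - 1)
    return chi2, df


# Hypothetical sequences with identical base composition.
homogeneous = composition_chi_square({"a": "ACGTACGT", "b": "TGCATGCA"})

# Hypothetical sequences with completely different composition.
heterogeneous = composition_chi_square({"a": "AAAA", "b": "CCCC"})
```

A statistic of zero means every sequence has exactly the pooled composition; larger values relative to the degrees of freedom indicate stronger heterogeneity.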

9.
Currently available methods for model selection used in phylogenetic analysis are based on an initial fixed-tree topology. Once a model is picked based on this topology, a rigorous search of the tree space is run under that model to find the maximum-likelihood estimate of the tree (topology and branch lengths) and the maximum-likelihood estimates of the model parameters. In this paper, we propose two extensions to the decision-theoretic (DT) approach that relax the fixed-topology restriction. We also relax the fixed-topology restriction for the Bayesian information criterion (BIC) and the Akaike information criterion (AIC) methods. We compare the performance of the different methods (the relaxed, restricted, and the likelihood-ratio test [LRT]) using simulated data. This comparison is done by evaluating the relative complexity of the models resulting from each method and by comparing the performance of the chosen models in estimating the true tree. We also compare the methods relative to one another by measuring the closeness of the estimated trees corresponding to the different chosen models under these methods. We show that varying the topology does not have a major impact on model choice. We also show that the outcome of the two proposed extensions is identical and is comparable to that of the BIC, Extended-BIC, and DT. Hence, using the simpler methods in choosing a model for analyzing the data is more computationally feasible, with results comparable to the more computationally intensive methods. Another outcome of this study is that earlier conclusions about the DT approach are reinforced. That is, LRT, Extended-AIC, and AIC result in more complicated models that do not contribute to the performance of the phylogenetic inference, yet cause a significant increase in the time required for data analysis.
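
The AIC, BIC, and LRT criteria compared in the abstract above have simple closed forms: AIC = 2k - 2 ln L, BIC = k ln n - 2 ln L, and the LRT statistic is 2 (ln L_alt - ln L_null), where k is the number of free parameters and n the sample size (taken here, as an assumption, to be the alignment length). The sketch below applies them to hypothetical log-likelihoods; the values are invented for illustration, and the parameter counts cover only the substitution model, on the assumption that branch lengths add the same number of parameters to every model on a fixed topology.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_sites) - 2 * log_likelihood


def lrt_statistic(log_likelihood_null, log_likelihood_alt):
    """Likelihood-ratio statistic for nested models: 2 (ln L_alt - ln L_null)."""
    return 2 * (log_likelihood_alt - log_likelihood_null)


# Hypothetical fits on one fixed topology: (ln L, substitution-model parameters).
models = {"JC69": (-5230.0, 0), "HKY85": (-5105.0, 4), "GTR": (-5101.5, 8)}
n_sites = 1000  # hypothetical alignment length

best_aic = min(models, key=lambda name: aic(*models[name]))
best_bic = min(models, key=lambda name: bic(*models[name], n_sites))
hky_vs_gtr = lrt_statistic(models["HKY85"][0], models["GTR"][0])
```

Because the BIC penalty grows with ln n, it tends to favor simpler models than the AIC on long alignments, which is in line with the abstract's observation that AIC and LRT select more complex models.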

10.
11.
The basal relationship of bryophytes and tracheophytes is problematic in land plant phylogeny. In addition to cladistic analyses of morphological data, molecular phylogenetic analyses of the nuclear small-subunit ribosomal RNA gene and the plastid gene rbcL have been performed, but no confident conclusions have been reached. Using the maximum-likelihood (ML) method, we analyzed 4,563 bp of aligned sequences from plastid protein-coding genes and 1,680 bp from the nuclear 18S rRNA gene. In the ML tree of deduced amino acid sequences of the plastid genes, hornworts were basal among the land plants, while mosses and liverworts each formed a clade and were sister to each other. Total-evidence evaluation of rRNA data and plastid protein-coding genes by TOTALML had an almost identical result.

12.
Discriminating phylogenetic signal from noise in DNA sequence data is a difficult problem in phylogenetic inference at higher systematic levels. For protein-coding genes, noise at synonymous (silent) positions can be filtered by deleting entire codon positions or types of change at a codon position. This method is not appropriate for replacement sites, because changes at each site within a codon may not be independent. This research presents a method using information from protein structure to evaluate variation in replacement sites. Analysis of the correlation of amino acid variation with protein structure identified rapidly evolving codons in the COIII gene. In a series of phylogenetic analyses attempting to recover a known set of vertebrate relationships, downweighting these labile codons produced the most accurate results. Structural correlates of variable and invariant residues identified in this study can be used to increase the accuracy of models used for phylogenetic inference. Viewing amino acid variation within a phylogenetic framework provided insight into residue changes important in the secondary and tertiary structures of the molecule, changes that were correlated between pairs of neighboring residues or between residues in neighboring helices.

13.
Phylogenetic reconstructions are a major component of many studies in evolutionary biology, but their accuracy can be reduced under certain conditions. Recent studies showed that the convergent evolution of some phenotypes resulted from recurrent amino acid substitutions in genes belonging to distant lineages. It has been suggested that these convergent substitutions could bias phylogenetic reconstruction toward grouping convergent phenotypes together, but such an effect has never been appropriately tested. We used computer simulations to determine the effect of convergent substitutions on the accuracy of phylogenetic inference. We show that, in some realistic conditions, even a relatively small proportion of convergent codons can strongly bias phylogenetic reconstruction, especially when amino acid sequences are used as characters. The strength of this bias does not depend on the reconstruction method but varies as a function of how much divergence had occurred among the lineages prior to any episodes of convergent substitutions. While the occurrence of this bias is difficult to predict, the risk of spurious groupings is strongly decreased by considering only 3rd codon positions, which are less subject to selection, as long as saturation problems are not present. Therefore, we recommend that, whenever possible, topologies obtained with amino acid sequences and 3rd codon positions be compared to identify potential phylogenetic biases and avoid evolutionarily misleading conclusions.

14.
Phylogenetic analyses of 110 serpin protein sequences revealed clades consistent with independent phylogenetic analyses based on exon-intron structure and diagnostic amino acid sites. Trees were estimated by maximum likelihood, neighbor joining, and partial split decomposition using both the BLOSUM 62 and Jones-Taylor-Thornton substitution matrices. Neighbor-joining trees gave results closest to those based on independent analyses using genomic and chromosomal data. The maximum-likelihood trees derived using the quartet puzzling algorithm were very conservative, producing many small clades that separated groups of proteins that other results suggest were related. Independent analyses based on exon-intron structure suggested that a neighbor-joining tree was more accurate than maximum-likelihood trees obtained using the quartet puzzling algorithm.

15.
Short interspersed nuclear elements (SINEs) have been used to generate unambiguous phylogenetic topologies relating eukaryotic taxa. The irreversible nature of SINE retroposition is supported by a large body of comparative genome data and is a fundamental assumption inherent in the value of this qualitative method of inference. Here, we assess the key assumption of unidirectional SINE insertion by comparing the SINE insertion-derived topology and the phylogenetic tree based on seven independent loci of five taxa in the order Cetartiodactyla (Cetacea + Artiodactyla). The data sets and analyses were largely independent, but the loci were, by definition, linked, and thus their consistency supported an irreversible pattern of SINE retroposition. Moreover, our analyses of the flanking sequences provided estimates of divergence times among cetartiodactyl lineages unavailable from SINE insertion analysis alone. Unexpected rate heterogeneity among sites of SINE-flanking sequences and other noncoding DNA sequences was observed. Sequence simulations suggest that this rate heterogeneity may be an artifact resulting from the inaccuracies of the substitution model used.

16.
Debates over whether molecular sequence data should be partitioned for phylogenetic analysis often confound two types of heterogeneity among partitions. We distinguish historical heterogeneity (i.e., different partitions have different evolutionary relationships) from dynamic heterogeneity (i.e., different partitions show different patterns of sequence evolution) and explore the impact of the latter on phylogenetic accuracy and precision with a two-gene, mitochondrial data set for cranes. The well-established phylogeny of cranes allows us to contrast tree-based estimates of relevant parameter values with estimates based on pairwise comparisons and to ascertain the effects of incorporating different amounts of process information into phylogenetic estimates. We show that codon positions in the cytochrome b and NADH dehydrogenase subunit 6 genes are dynamically heterogenous under both Poisson and invariable-sites + gamma-rates versions of the F84 model and that heterogeneity includes variation in base composition and transition bias as well as substitution rate. Estimates of transition-bias and relative-rate parameters from pairwise sequence comparisons were comparable to those obtained as tree-based maximum likelihood estimates. Neither rate-category nor mixed-model partitioning strategies resulted in a loss of phylogenetic precision relative to unpartitioned analyses. We suggest that weighted-average distances provide a computationally feasible alternative to direct maximum likelihood estimates of phylogeny for mixed-model analyses of large, dynamically heterogenous data sets.

17.
Phylogenetic relationships were studied based on DNA sequences obtained from all recognized genera of the family Corvidae sensu stricto. The aligned data set consists of 2589 bp obtained from one mitochondrial and two nuclear genes. Maximum parsimony, maximum-likelihood, and Bayesian inference analyses were used to estimate phylogenetic relationships. The analyses were done for each gene separately, as well as for all genes combined. An analysis of a taxonomically expanded data set of cytochrome b sequences was performed in order to infer the phylogenetic positions of six genera for which nuclear genes could not be obtained. Monophyly of the Corvidae is supported by all analyses, as well as by the occurrence of a deletion of 16 bp in the β-fibrinogen intron in all ingroup taxa. Temnurus and Pyrrhocorax are placed as the sister group to all other corvids, while Cissa and Urocissa appear as the next clade inside them. Further up in the tree, two larger and well-supported clades of genera were recovered by the analyses. One has an entirely New World distribution (the New World jays), while the other includes mostly Eurasian (and one African) taxa. Outside these two major clades are Cyanopica and Perisoreus, whose phylogenetic positions could not be determined by the present data. A biogeographic analysis of our data suggests that the Corvidae underwent an initial radiation in Southeast Asia. This is consistent with the observation that almost all basal clades in the phylogenetic tree consist of species adapted to tropical and subtropical forest habitats.

18.
Summary. Several forms of maximum likelihood models are applied to aligned amino acid sequence data coded for in the mitochondrial DNA of six species (chicken, frog, human, bovine, mouse, and rat). These models range in form from relatively simple models of the type currently used for inferring phylogenetic tree structure to models more complex than those that have been used previously. No major discrepancies between the optimal trees inferred by any of these methods are found, but there are huge differences in adequacy of fit. A very significant finding is that the fit of any of these models is vastly improved by allowing a certain proportion of the amino acid sites to be invariant. An even more important, although disquieting, finding is that none of these models fits well, as judged by standard statistical criteria. The primary reason for this is that amino acid sites undergo substitution according to a process that is very heterogeneous. Because most phylogenetic inference is accomplished by choosing the optimal tree under the assumption that a homogeneous process is acting on the sites, the potential invalidity of some such conclusions is raised by this article's results. The seriousness of this problem depends upon the robustness of the phylogenetic inferential procedure to departures from the underlying model.

19.
Functional shifts during protein evolution are expected to yield shifts in substitution rate, and statistical methods can test for this at both codon and amino acid levels. Although methods based on models of sequence evolution serve as powerful tools for studying evolutionary processes, violating underlying assumptions can lead to false biological conclusions. It is not unusual for functional shifts to be accompanied by changes in other aspects of the evolutionary process, such as codon or amino acid frequencies. However, models used to test for functional divergence assume these frequencies remain constant over time. We employed simulation to investigate the impact of non-stationary evolution on functional divergence inference. We investigated three likelihood ratio tests based on codon models and found varying degrees of sensitivity. Joint effects of shifts in frequencies and selection pressures can be large, leading to false signals for positive selection. Amino acid-based tests (FunDi and Bivar) were also compromised when several aspects of the substitution process were not adequately modeled. We applied the same tests to a core genome “scan” for functional divergence between light-adapted ecotypes of the cyanobacteria Prochlorococcus, and carried out gene-specific simulations for ten genes. Results of those simulations illustrated how the inference of functional divergence at the genomic level can be seriously impacted by model misspecification. Although computationally costly, simulations motivated by data in hand are warranted when several aspects of the substitution process are either misspecified or not included in the models upon which the statistical tests were built.

20.
We tested whether it is beneficial for the accuracy of phylogenetic inference to sample characters that are evolving under different sets of parameters, using both Bayesian MCMC (Markov chain Monte Carlo) and parsimony approaches. We examined differential rates of evolution among characters, differential character-state frequencies and character-state space, and differential relative branch lengths among characters. We also compared the relative performance of parsimony and Bayesian analyses by progressively incorporating more of these heterogeneous parameters and progressively increasing the severity of this heterogeneity. Bayesian analyses performed better than parsimony when heterogeneous simulation parameters were incorporated into the substitution model. However, parsimony outperformed Bayesian MCMC when heterogeneous simulation parameters were not incorporated into the Bayesian substitution model. The higher the rate of evolution simulated, the better parsimony performed relative to Bayesian analyses. Bayesian and parsimony analyses converged in their performance as the number of simulated heterogeneous model parameters increased. Up to a point, rate heterogeneity among sites was generally advantageous for phylogenetic inference using both approaches. In contrast, branch-length heterogeneity was generally disadvantageous for phylogenetic inference using both parsimony and Bayesian approaches. Parsimony was found to be more conservative than Bayesian analyses, in that it resolved fewer incorrect clades.
© The Willi Hennig Society 2006.
