Similar literature
20 similar articles found.
1.
Z. Yang, S. Kumar, M. Nei. Genetics, 1995, 141(4): 1641-1650
A statistical method was developed for reconstructing the nucleotide or amino acid sequences of extinct ancestors, given the phylogeny and sequences of the extant species. A model of nucleotide or amino acid substitution was employed to analyze data of the present-day sequences, and maximum likelihood estimates of parameters such as branch lengths were used to compare the posterior probabilities of assignments of character states (nucleotides or amino acids) to interior nodes of the tree; the assignment having the highest probability was the best reconstruction at the site. The lysozyme c sequences of six mammals were analyzed by using the likelihood and parsimony methods. The new likelihood-based method was found to be superior to the parsimony method. The probability that the amino acids for all interior nodes at a site reconstructed by the new method are correct was calculated to be 0.91, 0.86, and 0.73 for all, variable, and parsimony-informative sites, respectively, whereas the corresponding probabilities for the parsimony method were 0.84, 0.76, and 0.51, respectively. The probability that an amino acid in an ancestral sequence is correctly reconstructed by the likelihood analysis ranged from 91.3 to 98.7% for the four ancestral sequences.
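The idea of ranking ancestral assignments by posterior probability can be illustrated with a deliberately small sketch. It assumes a star tree (every observed sequence attached directly to one ancestor), the Jukes-Cantor (JC69) substitution model, and a uniform prior; none of these choices comes from the abstract, which works on a full phylogeny with estimated branch lengths. The function names are hypothetical.

```python
import math

BASES = "ACGT"

def jc69_prob(x, y, t):
    """JC69 probability that base x is replaced by base y along a branch
    of length t (expected substitutions per site). Assumed model."""
    decay = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * decay if x == y else 0.25 - 0.25 * decay

def root_posterior(leaves):
    """Posterior distribution of the ancestral base at one site.

    `leaves` is a list of (observed_base, branch_length) pairs, one per
    descendant attached directly to the ancestor (star tree assumption).
    """
    joint = {}
    for x in BASES:
        likelihood = 0.25  # uniform prior on the ancestral base
        for base, t in leaves:
            likelihood *= jc69_prob(x, base, t)
        joint[x] = likelihood
    total = sum(joint.values())
    return {x: joint[x] / total for x in BASES}

# Three descendants observed at one site, with assumed branch lengths.
post = root_posterior([("A", 0.1), ("A", 0.2), ("G", 0.3)])
best = max(post, key=post.get)  # state with the highest posterior probability
```

The posterior value attached to `best` plays the role of the per-site reliability that the abstract reports; on a real tree the same calculation would be carried out with a pruning pass over all interior nodes.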

2.
Developmental biology often yields data in a temporal context. Temporal data in phylogenetic systematics has important uses in the field of evolutionary developmental biology and, in general, comparative biology. The evolution of temporal sequences, specifically developmental sequences, has proven difficult to examine due to the highly variable temporal progression of development. Issues concerning the analysis of temporal sequences and problems with current methods of analysis are discussed. We present here an algorithm to infer ancestral temporal sequences, quantify sequence heterochronies, and estimate pseudoreplicate consensus support for sequence changes using Parsimov-based genetic inference [PGi]. Real temporal developmental sequence data sets are used to compare PGi with currently used approaches, and PGi is shown to be the most efficient, accurate, and practical method to examine biological data and infer ancestral states on a phylogeny. The method is also expandable to address further issues in developmental evolution, namely modularity.

3.
The course of evolutionary change in DNA sequences has been modeled as a Markov process. The Markov process was represented by discrete time matrix methods. The parameters of the Markov transition matrices were estimated by least-squares direct-search optimization of the fit of the calculated divergence matrix to that observed for two aligned sequences. The Markov process corrected for multiple and parallel substitutions of bases at the same site. The method avoided the incorrect assumption of all previously described methods that the divergence between two present-day sequences is twice the divergence of either from the common and unknown ancestral sequence. The three previous methods were shown to be equivalent. The present method also avoided the undesirable assumptions that sequence composition has not changed with time and that the substitution rates in the two descendant lineages were the same. It permitted simultaneous estimation of ancestral sequence composition and, if applicable, of different substitution rates for the two descendant lineages, provided the total number of estimated parameters was less than 16. Properties of the Markov chain were discussed. It was proved for symmetric substitution matrices that all elements of the equilibrium divergence matrix equal 1/16, and that the total difference in the divergence matrix at epoch k equals the total change in the common substitution matrix at epoch 2k for all values of k. It was shown how to resolve an ambiguity in the assignment of two different substitution rates to the two descendant lineages when four or more similar sequences are available. The method was applied to the divergence matrix for codon site 3 for the mouse and rabbit beta-globins. This observed divergence matrix was significantly asymmetric and required at least two different substitution rates. This result could be achieved only by using different asymmetric substitution matrices for the two lineages.
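The 1/16 equilibrium property can be checked numerically with a small sketch. It assumes that the divergence matrix at epoch k has entries X_ij = sum over ancestral bases s of pi_s * (P1^k)_si * (P2^k)_sj, where pi is the ancestral composition and P1, P2 are the one-epoch substitution matrices of the two lineages. That formulation, the example matrix, and all names are assumptions made for illustration; the abstract does not spell them out.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(p, k):
    """Return p raised to the k-th power (k discrete epochs)."""
    n = len(p)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, p)
    return result

def divergence_matrix(pi, p1, p2, k):
    """Joint frequency of base i in lineage 1 and base j in lineage 2
    after k epochs, starting from ancestral composition pi."""
    a, b = matpow(p1, k), matpow(p2, k)
    n = len(pi)
    return [[sum(pi[s] * a[s][i] * b[s][j] for s in range(n))
             for j in range(n)] for i in range(n)]

# Symmetric one-epoch matrix, base order A, C, G, T (illustrative values):
# transitions (A<->G, C<->T) at 0.02, transversions at 0.005.
P = [[0.97, 0.005, 0.02, 0.005],
     [0.005, 0.97, 0.005, 0.02],
     [0.02, 0.005, 0.97, 0.005],
     [0.005, 0.02, 0.005, 0.97]]
PI = [0.4, 0.3, 0.2, 0.1]  # deliberately unequal ancestral composition

one_epoch = divergence_matrix(PI, P, P, 1)
equilibrium = divergence_matrix(PI, P, P, 2000)
```

Even with an unequal ancestral composition, every entry of `equilibrium` approaches 1/16 = 0.0625 for this symmetric matrix, which matches the property stated in the abstract.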

4.
A 3.1-kb intergenic DNA fragment located between the psi beta-globin and delta-globin genes in the beta-globin gene cluster was cloned from gorilla, orangutan, rhesus monkey, and spider monkey, and the nucleotide sequence of each fragment was determined. The phylogeny of these four sequences, together with two previously published allelic sequences from humans and one from chimpanzee, was constructed, and the accumulation of mutations in the region was analyzed. The sites of base substitutions are not evenly distributed within the region: two Alu repeats have accumulated 0.21 ± 0.02 substitutions/site, compared with 0.15 ± 0.008 substitutions/site in the remainder of the fragment. The occurrence of substitutions at neighboring sites is more frequent than would be expected if they were independent. The observed excesses disappear when ancestral -CG- dinucleotide sites are excluded. The phylogenetic relationships of the sequences indicate that the human sequence shares a most recent coancestor with the chimpanzee sequence. The data also show that great apes have accumulated fewer mutations in this part of the genome than has the rhesus monkey. The relative rates of accumulation of 12 kinds of nucleotide substitution in the region during primate evolution are asymmetric in the DNA strands. From these rates of accumulation, the origin of a simple stretch of sequence near the 3' end of the 3.1-kb fragment was deduced to be a sequence comprising 50% T and 50% C on one strand. The two oppositely oriented Alu sequences in the 3.1-kb region were inserted at their present positions before the divergence of the New-World monkeys from other lineages. Our analysis shows that the nucleotide sequences of the two Alu repeats in spider monkey are unexpectedly similar both to each other and to the deduced ancestral sequence of Alu repeats. The data suggest that there has been some type of recombinational event between the spider monkey Alu repeats but that it was not a simple gene conversion.

5.
A known phylogeny was generated using a four-step serial bifurcate PCR method. The ancestor sequence (SSU rDNA) evolved in vitro for 280 nested PCR cycles, and the resulting 15 ancestor and 16 terminal sequences (2,238 bp each) were determined. Parsimony, distance, and maximum likelihood analysis of the terminal sequences reconstructed the topology of the real phylogeny and branch lengths accurately. Divergence dates and ancestor sequences were estimated with very small error, particularly at the base of the phylogeny, mostly due to insertion and deletion changes. The substitution patterns along the known phylogeny are not described by reversible models, and accordingly, the probability substitution matrix, based on the observed substitutions from ancestor to terminal nodes along the known phylogeny, was calculated. This approach is an extension of previous studies using bacteriophage serial propagation, because here mutations were allowed to occur neutrally rather than by addition of a mutagenic agent, which produced biased mutational changes. These results provide for the first time biochemical experimental support for phylogenies, divergence date estimates, and an irreversible substitution model based on neutrally evolving DNA sequences. The substitution preferences observed here (A to G and T to C) are consistent with the high G+C content of the Thermus aquaticus genome. This suggests, at least in part, that the method here described, which exploits the high Taq DNA polymerase error rate, simulates the evolution of a DNA segment in a thermophilic organism. These organisms include the bacterial rod T. aquaticus and several Archaea, and thus, the method and data set described here may well contribute new insights about the genome evolution of these organisms.

6.
Reconstructing phylogeny is a crucial target of contemporary biology, now commonly approached through computerized analysis of genetic sequence data. In angiosperms, despite recent progress at the ordinal level, many relationships between families remain unclear. Here we take a case study from Lamiales, an angiosperm order in which interfamilial relationships have so far proved particularly problematic. We examine the effect of changing one factor (the quantity of sequence data analyzed) on phylogeny reconstruction in this group. We use simulation to estimate a priori the sequence data that would be needed to resolve an accurate, supported phylogeny of Lamiales. We investigate the effect of increasing the length of sequence data analyzed, the rate of substitution in the sequences used, and of combining gene partitions. This method could be a valuable technique for planning systematic investigations in other problematic groups. Our results suggest that increasing sequence length is a better way to improve support, resolution, and accuracy than employing sequences with a faster substitution rate. Indeed, the latter may in some cases have detrimental effects on phylogeny reconstruction. Further molecular sequencing (of at least 10,000 bp) should result in a fully resolved and supported phylogeny of Lamiales, but at present the problematic aspects of this tree model remain.

7.
A nonhomogeneous, nonstationary stochastic model of DNA sequence evolution allowing varying equilibrium G + C contents among lineages is devised in order to deal with sequences of unequal base compositions. A maximum-likelihood implementation of this model for phylogenetic analyses allows handling of a reasonable number of sequences. The relevance of the model and the accuracy of parameter estimates are theoretically and empirically assessed, using real or simulated data sets. Overall, a significant amount of information about past evolutionary modes can be extracted from DNA sequences, suggesting that process (rates of distinct kinds of nucleotide substitutions) and pattern (the evolutionary tree) can be simultaneously inferred. G + C contents at ancestral nodes are quite accurately estimated. The new method appears to be useful for phylogenetic reconstruction when base composition varies among compared sequences. It may also be suitable for molecular evolution studies.

8.
We present a molecular phylogeny for the genus Hemileuca (Saturniidae), based on 624 bp of mitochondrial cytochrome oxidase I (COI) and 932 bp of the nuclear gene elongation factor 1 alpha (EF1alpha). Combined analysis of both gene sequences increased resolution and supported most of the phylogenetic relationships suggested by separate analysis of each gene. However, a maximum parsimony (MP) model for just COI sequence from one sample of most taxa produced a phylogeny incongruent with EF1alpha and combined dataset analyses under either MP or ML models. Time of year and time of day during which adult moths fly corresponded strongly with the phylogeny. Although most Hemileuca are diurnal, ancestral Hemileuca probably were nocturnal, fall-flying insects. The two-gene molecular phylogeny suggests that wing morphology is frequently homoplastic. There was no correlation between the primary larval hostplants and phylogenetic placement of taxa. No phylogenetic pattern of specialization was evident for single hostplant families across the genus. Our results suggest that phenological behavioral characters may be more conserved than the wing morphology characters that are more commonly used to infer phylogenetic relationships in Lepidoptera. Inclusion of a molecular component in the re-evaluation of systematic data is likely to alter prior assumptions of phylogenetic relationships in groups where such potentially homoplastic characters have been used.

9.
Fixed Character States and the Optimization of Molecular Sequence Data
A method is proposed to optimize molecular sequence data that does not employ multiple sequence alignment. This method treats entire homologous contiguous stretches of sequence data as individual characters. This sequence is treated as the homologous unit employed in phylogeny reconstruction. The sets of specific sequences exhibited by the terminal taxa constitute the character states. The number of states is then less than or equal to the number of unique sequences (or homologous fragments) exhibited by the data. A matrix of transformation costs is created to relate the states to one another. The cells of this matrix are defined as the minimum transformation cost between each pair of states based on insertion–deletion and base substitution costs. The diagnosis of a topology then follows existing dynamic programming techniques, with the number of states greatly expanded. Since the possible sequences reconstructed at nodes are limited to those exhibited by the terminals, cladograms constructed in this way may be longer than those of other methods in that they require a greater number of weighted evolutionary events. Example data, the effects of missing data, restricted ancestors, and putative long-branch attraction are discussed.
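The matrix of transformation costs described above can be sketched with a standard weighted edit-distance recurrence. The cost values (indel = 2, substitution = 1), the example sequences, and the function names are assumptions chosen for illustration; the abstract does not state which costs or which implementation the authors used.

```python
def transformation_cost(a, b, indel=2, substitution=1):
    """Minimum cost of turning sequence a into sequence b using
    insertions/deletions and base substitutions (weighted edit distance)."""
    prev = [j * indel for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * indel]
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (0 if a[i - 1] == b[j - 1] else substitution)
            curr.append(min(diagonal, prev[j] + indel, curr[j - 1] + indel))
        prev = curr
    return prev[-1]

def cost_matrix(states, indel=2, substitution=1):
    """Pairwise transformation costs between all observed sequence states."""
    return [[transformation_cost(s, t, indel, substitution) for t in states]
            for s in states]

# Each unique terminal sequence is treated as one character state.
states = ["ACGT", "ACT", "AGGT"]
matrix = cost_matrix(states)
# matrix == [[0, 2, 1], [2, 0, 3], [1, 3, 0]]
```

A tree-scoring step would then look up these costs for each branch, with the states at interior nodes restricted to the sequences listed in `states`.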

10.
Ren F, Tanaka H, Yang Z. Systematic Biology, 2005, 54(5): 808-818
Models of codon substitution have been commonly used to compare protein-coding DNA sequences and are particularly effective in detecting signals of natural selection acting on the protein. Their utility in reconstructing molecular phylogenies and in dating species divergences has not been explored. Codon models naturally accommodate synonymous and nonsynonymous substitutions, which occur at very different rates and may be informative for recent and ancient divergences, respectively. Thus codon models may be expected to make an efficient use of phylogenetic information in protein-coding DNA sequences. Here we applied codon models to 106 protein-coding genes from eight yeast species to reconstruct phylogenies using the maximum likelihood method, in comparison with nucleotide- and amino acid-based analyses. The results appeared to confirm that expectation. Nucleotide-based analyses, under simplistic substitution models, were efficient in recovering recent divergences, whereas amino acid-based analyses performed better at recovering deep divergences. Codon models appeared to combine the advantages of amino acid and nucleotide data and had good performance at recovering both recent and deep divergences. Estimation of relative species divergence times using amino acid and codon models suggested that translation of gene sequences into proteins led to an information loss ranging from 30% for deep nodes to 66% for recent nodes. Although computational burden makes codon models unfeasible for tree search in large data sets, we suggest that they may be useful for comparing candidate trees. Nucleotide models that accommodate the differences in evolutionary dynamics at the three codon positions also performed well, at much less computational cost. We discuss the relationship between a model's fit to data and its utility in phylogeny reconstruction and caution against use of overly complex substitution models.
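The distinction between synonymous and nonsynonymous changes that codon models build on can be shown with a short sketch. It uses the standard genetic code and simply classifies codon pairs that differ at exactly one position; it is a raw count for illustration, not a codon substitution model and not the authors' likelihood analysis. The function name and example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def classify_differences(seq1, seq2):
    """Count single-position codon differences between two aligned coding
    sequences as (synonymous, nonsynonymous). Identical codons and codons
    differing at more than one position are skipped in this sketch."""
    if len(seq1) != len(seq2) or len(seq1) % 3 != 0:
        raise ValueError("sequences must be aligned and a multiple of 3 long")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq1), 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        if sum(x != y for x, y in zip(c1, c2)) != 1:
            continue
        if CODON_TABLE[c1] == CODON_TABLE[c2]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous

# TTT -> TTC keeps Phe (synonymous); GGA -> GCA changes Gly to Ala.
counts = classify_differences("ATGTTTGGA", "ATGTTCGCA")
# counts == (1, 1)
```

Translating the sequences to amino acids would hide the TTT/TTC difference entirely, which is one way to picture the information loss the abstract attributes to protein-level analysis.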

11.
Sequence analysis of a polymorphic Mhc class II gene in Pacific salmon
Polymorphism of the nucleotide sequences encoding 149 amino acids of linked major histocompatibility complex (Mhc) class II β1 and β2 peptides, and of the intervening intron (548–773 base pairs), was examined within and among seven Pacific salmon (Oncorhynchus) species. Levels of nucleotide diversity were higher for the B1 sequence than for B2 or the intron in comparisons both within and between species. For the codons of the peptide binding region of the B1 sequence, the level of nonsynonymous nucleotide substitution (dN) exceeded the level of synonymous substitution (dS) by a factor of ten for within-species comparisons, and by a factor of four for between-species comparisons. The excess of dN indicates that balancing selection maintains diversity at this salmonid Mhc class II locus, as is common for Mhc loci in other vertebrates. Levels of nucleotide diversity for both the exon and intron sequences were greater among than within species, and there were numerous species-specific nucleotides present in both the coding and noncoding regions. Thus, neighbor-joining analysis of both the intron and exon regions provided phylogenies in which the sequences clustered strongly by species. There was little evidence of shared ancestral (trans-species) polymorphism in the exon phylogeny, and the intron phylogeny depicted standard relationships among the Pacific salmon species. The lack of shared allelic B1 lineages in these closely related species may result from severe bottlenecks that occurred during speciation or during the ice ages that glaciated the rim of the north Pacific Ocean approximately every 100 000 years in the Pleistocene. The nucleotide sequence data reported in this paper have been submitted to the GenBank nucleotide sequence database and have been assigned the accession numbers U34692-U34720.

12.
The present study illustrates a method for analysing the biogeography of a group that is based on the group's phylogeny but does not invoke founder dispersal or centre of origin. The case studies presented include groups from many different parts of the world, but most are from the south‐west Pacific. The idea that basal groups are ancestral is not valid as a generalization. Neither the basal group, nor the oldest fossil represents the centre of origin, the time of origin or the ancestral ecology. Basal groups comprise less diverse sister groups and their distributions occur around centres of differentiation in already widespread ancestors, and not centres of origin for the whole group. Thus, the sequence of nodes in a phylogeny may indicate the spatial sequence of differentiation in a widespread ancestor rather than a series of founder dispersal events. Allocation of clades to a priori geographic areas, such as the continents, in the initial stages of biogeographic analysis has often involved incorrect assumptions of sympatry. This has led to the idea that the ‘areas of sympatry’ were centres of origin. Areas other than those defined by the taxa themselves need not be used in analysis. The fossil‐calibrated molecular clock, with dates transmogrified from minimum to maximum dates, has been used to test for vicariance. Recent work in population genetics, however, indicates that allopatry is caused by vicariance rather than founder dispersal, and so vicariance can instead be used to test the clock. Deriving evolutionary chronology by calibrating spatial vicariance in molecular clades with associated tectonic events is more reasonable than relying on the fossil record to give maximum (absolute) dates. © 2009 The Linnean Society of London, Biological Journal of the Linnean Society, 2009, 98, 757–774.

13.
Akashi H, Goel P, John A. PLoS ONE, 2007, 2(10): e1065
Reliable inference of ancestral sequences can be critical to identifying both patterns and causes of molecular evolution. Robustness of ancestral inference is often assumed among closely related species, but tests of this assumption have been limited. Here, we examine the performance of inference methods for data simulated under scenarios of codon bias evolution within the Drosophila melanogaster subgroup. Genome sequence data for multiple, closely related species within this subgroup make it an important system for studying molecular evolutionary genetics. The effects of asymmetric and lineage-specific substitution rates (i.e., varying levels of codon usage bias and departures from equilibrium) on the reliability of ancestral codon usage were investigated. Maximum parsimony inference, which has been widely employed in analyses of Drosophila codon bias evolution, was compared to an approach that attempts to account for uncertainty in ancestral inference by weighting ancestral reconstructions by their posterior probabilities. The latter approach employs maximum likelihood estimation of rate and base composition parameters. For equilibrium and most non-equilibrium scenarios that were investigated, the probabilistic method appears to generate reliable ancestral codon bias inferences for molecular evolutionary studies within the D. melanogaster subgroup. These reconstructions are more reliable than parsimony inference, especially when codon usage is strongly skewed. However, inference biases are considerable for both methods under particular departures from stationarity (i.e., when adaptive evolution is prevalent). Reliability of inference can be sensitive to branch lengths, asymmetry in substitution rates, and the locations and nature of lineage-specific processes within a gene tree. Inference reliability, even among closely related species, can be strongly affected by (potentially unknown) patterns of molecular evolution in lineages ancestral to those of interest.

14.
H Mannen, S Tsuji, R T Loftus, D G Bradley. Genetics, 1998, 150(3): 1169-1175
This article describes complete mitochondrial DNA displacement loop sequences from 32 Japanese Black cattle and the analysis of these data in conjunction with previously published sequences from African, European, and Indian subjects. The origins of North East Asian domesticated cattle are unclear. The earliest domestic cattle in the region were Bos taurus and may have been domesticated from local wild cattle (aurochsen; B. primigenius), or perhaps had an origin in migrants from the early domestic center of the Near East. In phylogenetic analyses, taurine sequences form a dense tree with a center consisting of intermingled European and Japanese sequences with one group of Japanese and another of all African sequences, each forming distinct clusters at extremes of the phylogeny. This topology and calibrated levels of sequence divergence suggest that the clusters may represent three different strains of ancestral aurochs, adopted at geographically and temporally separate stages of the domestication process. Unlike the African sequences, half of the Japanese cattle sequences are topologically intermingled with the European variants. This suggests an interchange of variants that may be ancient, perhaps a legacy of the first introduction of domesticates to East Asia.

15.
Hypothesized relationships between ontogenetic and phylogenetic change in morphological characters were empirically tested in centrarchid fishes by comparing observed patterns of character development with patterns of character evolution as inferred from a representative phylogenetic hypothesis. This phylogeny was based on 56–61 morphological characters that were polarized by outgroup comparison. Through these comparisons, evolutionary changes in character ontogeny were categorized in one of eight classes (terminal addition, terminal deletion, terminal substitution, non-terminal addition, non-terminal deletion, non-terminal substitution, ontogenetic reversal and substitution). The relative frequencies of each of these classes provided an empirical basis from which assumptions underlying hypothesized relationships between ontogeny and phylogeny were tested. In order to test hypothesized relationships between ontogeny and phylogeny that involve assumptions about the relative frequencies of terminal change (e.g. the use of ontogeny as a homology criterion), two additional phylogenies were generated in which terminal addition and terminal deletion were maximized and minimized for all characters. Character state change interpreted from these phylogenies thus represents the maxima and minima of the frequency range of terminal addition and terminal deletion for the 8.7 × 10^36 trees possible for centrarchids. It was found for these data that terminal change accounts for c. 75% of the character state change. This suggests either that early ontogeny is conserved in evolution or that interpretation and classification of evolutionary changes in ontogeny is biased in part by the way that characters are recognized, delimited and coded.
It was found that ontogenetic interpretation is influenced by two levels of homology decision: an initial decision involving delimitation of the character (the ontogenetic sequence), and the subsequent recognition of homologous components of developmental sequences. Recognition of phylogenetic homology among individual components of developmental sequences is necessary for interpretation of evolutionary changes in ontogeny as either terminal or non-terminal. If development is the primary criterion applied in recognizing individual homologies among parts of ontogenetic sequences, the only possible interpretation of phylogenetic differences is that of terminal change. If homologies of the components cannot be ascertained, recognition of the homology of the developmental sequence as a whole will result in the interpretation of evolutionary differences as substitutions. Particularly when the objective of a study is to discover how ontogeny has evolved, criteria in addition to ontogeny must be used to recognize homology. Interpretation is also dependent upon delimitation within an ontogenetic sequence. This is in part a function of the way that an investigator ‘sees’ and codes characters. Binary and multistate characters influence interpretation differently and predictably. The use of ontogeny for determining phylogenetic polarity as previously proposed rests on the assumptions that ancestral ontogenies are conserved and that character evolution occurs predominantly through terminal addition. It was found for these data that terminal addition may comprise a maximum of 51.9% of the total character state change. It is concluded that the ontogenetic criterion is not a reliable indicator of phylogenetic polarity. Process and pattern data are collected simultaneously by those engaged in comparative morphological studies of development. The set of alternative explanatory processes is limited in the process of observing development. 
These form necessary starting points for the research of developmental biologists. Separating ‘empirical’ results from interpretational influences requires awareness of potential biases in the course of character selection, coding and interpretation. Consideration of the interpretational problems involved in identifying and classifying phylogenetic changes in ontogeny leads to a re-evaluation of the purpose, usefulness and information conveyed by the current classification system. It is recommended that alternative classification schemes be pursued.

16.
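Application of DNA in avian molecular phylogenetic studies
马玉堃, 牛黎明, 国会艳. 遗传 (Hereditas, Beijing), 2006, 28(1): 97-104
DNA techniques commonly used in avian molecular phylogenetic studies include DNA hybridization, RFLP, and DNA sequence analysis. DNA hybridization was once applied to birds on a large scale and gave rise to a new avian classification system. In RFLP analyses of birds, the most frequently used target sequence is mitochondrial DNA. DNA sequence analysis is regarded as the most effective and reliable method for molecular phylogenetic research. Mitochondrial genes are the most widely used in DNA sequence analysis, but because of their inherent shortcomings, many researchers have in recent years turned to nuclear genes and combined mitochondrial and nuclear genes in phylogenetic studies. At present, the nuclear genes most often used in avian molecular phylogenetics are scnDNA, whose introns can be used for phylogenetic studies at intermediate taxonomic levels, whereas the exons are mainly used at higher taxonomic levels. Besides the problems of the molecular markers themselves, avian molecular phylogenetic studies also face methodological problems, including the choice of molecular markers, sample size, and data processing. Future avian molecular phylogenetic research should pay more attention to the standardization of methods.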

17.
Bayesian estimation of ancestral character states on phylogenies
Biologists frequently attempt to infer the character states at ancestral nodes of a phylogeny from the distribution of traits observed in contemporary organisms. Because phylogenies are normally inferences from data, it is desirable to account for the uncertainty in estimates of the tree and its branch lengths when making inferences about ancestral states or other comparative parameters. Here we present a general Bayesian approach for testing comparative hypotheses across statistically justified samples of phylogenies, focusing on the specific issue of reconstructing ancestral states. The method uses Markov chain Monte Carlo techniques for sampling phylogenetic trees and for investigating the parameters of a statistical model of trait evolution. We describe how to combine information about the uncertainty of the phylogeny with uncertainty in the estimate of the ancestral state. Our approach does not constrain the sample of trees only to those that contain the ancestral node or nodes of interest, and we show how to reconstruct ancestral states of uncertain nodes using a most-recent-common-ancestor approach. We illustrate the methods with data on ribonuclease evolution in the Artiodactyla. Software implementing the methods (BayesMultiState) is available from the authors.

18.
L. Vawter, W. M. Brown. Genetics, 1993, 134(2): 597-608
The small subunit ribosomal RNA gene (srDNA) has been used extensively for phylogenetic analyses. One common assumption in these analyses is that substitution rates are biased toward transitions. We have developed a simple method for estimating relative rates of base change that does not assume rate constancy and takes into account base composition biases in different structures and taxa. We have applied this method to srDNA sequences from taxa with a noncontroversial phylogeny to measure relative rates of evolution in various structural regions of srRNA and relative rates of the different transitions and transversions. We find that: (1) the long single-stranded regions of the RNA molecule evolve slowest, (2) biases in base composition associated with structure and phylogenetic position exist, and (3) the srDNAs studied lack a consistent transition/transversion bias. We have made suggestions based on these findings for refinement of phylogenetic analyses using srDNA data.
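For readers unfamiliar with the transition/transversion distinction, the following minimal sketch counts both kinds of difference between two aligned DNA sequences. It is a raw pairwise count only: it does not correct for base composition or multiple hits and is not the estimation method the abstract describes. The function name and example sequences are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def count_ts_tv(seq1, seq2):
    """Return (transitions, transversions) between two aligned DNA sequences.

    Transitions are purine<->purine or pyrimidine<->pyrimidine changes;
    transversions are purine<->pyrimidine changes. Positions containing
    anything other than A, C, G or T (gaps, ambiguity codes) are ignored.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x == y or x not in "ACGT" or y not in "ACGT":
            continue
        pair = {x, y}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions

ts, tv = count_ts_tv("AACCGT", "GACTTA")
# Differences: A/G and C/T are transitions, G/T and T/A are transversions.
```

With equal rates for all twelve changes, transversions would be expected about twice as often as transitions, because each base has two transversion partners and one transition partner.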

19.
Felsenstein's maximum-likelihood approach for inferring phylogeny from DNA sequences assumes that the rate of nucleotide substitution is constant over different nucleotide sites. This assumption is sometimes unrealistic, as has been revealed by analysis of real sequence data. In the present paper Felsenstein's method is extended to the case where substitution rates over sites are described by the gamma distribution. A numerical example is presented to show that the method fits the data better than do previous models.
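The effect of gamma-distributed rates on a transition probability can be written in closed form for a simple model. The sketch below assumes the Jukes-Cantor (JC69) model and a gamma distribution with mean 1 and shape parameter alpha; averaging exp(-4rt/3) over that distribution gives (1 + 4t/(3*alpha))^(-alpha). JC69 is an assumption made here for brevity and is not necessarily the substitution model used in the paper; the function names are hypothetical.

```python
import math

def jc69_same(t):
    """Probability that a site shows the same base after branch length t
    under JC69 with one rate for all sites."""
    return 0.25 + 0.75 * math.exp(-4.0 * t / 3.0)

def jc69_gamma_same(t, alpha):
    """Same probability when the site rate r follows a gamma distribution
    with mean 1 and shape alpha: E[exp(-4rt/3)] = (1 + 4t/(3*alpha))**(-alpha)."""
    return 0.25 + 0.75 * (1.0 + 4.0 * t / (3.0 * alpha)) ** (-alpha)

t = 0.5
equal_rates = jc69_same(t)
strong_variation = jc69_gamma_same(t, alpha=0.5)  # small alpha: very unequal rates
near_equal = jc69_gamma_same(t, alpha=1e6)        # large alpha: nearly equal rates
```

A small alpha leaves more sites unchanged than the equal-rates model predicts for the same average branch length, so ignoring rate variation tends to underestimate distances; as alpha grows, the gamma model converges to the equal-rates case.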

20.
Liu L, Pearl DK. Systematic Biology, 2007, 56(3): 504-514
The desire to infer the evolutionary history of a group of species should be more viable now that a considerable amount of multilocus molecular data is available. However, the current molecular phylogenetic paradigm still reconstructs gene trees to represent the species tree. Further, commonly used methods of combining data, such as the concatenation method, are known to be inconsistent in some circumstances. In this paper, we propose a Bayesian hierarchical model to estimate the phylogeny of a group of species using multiple estimated gene tree distributions, such as those that arise in a Bayesian analysis of DNA sequence data. Our model employs substitution models used in traditional phylogenetics but also uses coalescent theory to explain genealogical signals from species trees to gene trees and from gene trees to sequence data, thereby forming a complete stochastic model to estimate gene trees, species trees, ancestral population sizes, and species divergence times simultaneously. Our model is founded on the assumption that gene trees, even of unlinked loci, are correlated due to being derived from a single species tree and therefore should be estimated jointly. We apply the method to two multilocus data sets of DNA sequences. The estimates of the species tree topology and divergence times appear to be robust to the prior of the population size, whereas the estimates of effective population sizes are sensitive to the prior used in the analysis. These analyses also suggest that the model is superior to the concatenation method in fitting these data sets and thus provides a more realistic assessment of the variability in the distribution of the species tree that may have produced the molecular information at hand. Future improvements of our model and algorithm should include consideration of other factors that can cause discordance of gene trees and species trees, such as horizontal transfer or gene duplication.
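Why gene trees can disagree with the species tree is easiest to see in the three-species case, where coalescent theory gives a simple closed form. For a rooted species tree ((A,B),C) with an internal branch of length T in coalescent units, the gene tree matches the species tree with probability 1 - (2/3)exp(-T), and each of the two discordant topologies has probability (1/3)exp(-T). The sketch below evaluates that standard result; it is not the hierarchical model proposed in the abstract, and the function name is hypothetical.

```python
import math

def gene_tree_probabilities(t):
    """Return (concordant, each_discordant) gene tree probabilities for a
    rooted three-species tree whose internal branch has length t in
    coalescent units (generations divided by 2N for a diploid population)."""
    if t < 0:
        raise ValueError("branch length must be non-negative")
    each_discordant = math.exp(-t) / 3.0
    concordant = 1.0 - 2.0 * each_discordant
    return concordant, each_discordant

short_branch = gene_tree_probabilities(0.1)  # frequent discordance
long_branch = gene_tree_probabilities(5.0)   # gene trees almost always match
```

When the internal branch is short relative to the population size, a large share of loci carry a discordant genealogy, which is the situation in which treating gene trees as draws from a shared species tree, rather than concatenating the loci, matters most.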
