Similar articles
20 similar articles found (search time: 15 ms)
1.
The statistical properties of sample estimation and bootstrap estimation of phylogenetic variability from a sample of nucleotide sequences are studied by using model trees of three taxa with an outgroup and by assuming a constant rate of nucleotide substitution. The maximum-parsimony method of tree reconstruction is used. An analytic formula is derived for estimating the sequence length that is required if P, the probability of obtaining the true tree from the sampled sequences, is to be equal to or higher than a given value. Bootstrap estimation is formulated as a two-step sampling procedure: (1) sampling of sequences from the evolutionary process and (2) resampling of the original sequence sample. The probability that a bootstrap resampling of an original sequence sample will support the true tree is found to depend on the model tree, the sequence length, and the probability that a randomly chosen nucleotide site is an informative site. When a trifurcating tree is used as the model tree, the probability that one of the three bifurcating trees will appear in ≥ 95% of the bootstrap replicates is < 5%, even if the number of bootstrap replicates is only 50; therefore, the probability of accepting an erroneous tree as the true tree is < 5% if that tree appears in ≥ 95% of the bootstrap replicates and if more than 50 bootstrap replications are conducted. However, if a particular bifurcating tree is observed in, say, < 75% of the bootstrap replicates, then it cannot be claimed to be better than the trifurcating tree even if ≥ 1,000 bootstrap replications are conducted. When a bifurcating tree is used as the model tree, the bootstrap approach tends to overestimate P when the sequences are very short, but it tends to underestimate that probability when the sequences are long. Moreover, simulation results show that, if a tree is accepted as the true tree only if it has appeared in ≥ 95% of the bootstrap replicates, then the probability of failing to accept any bifurcating tree can be as large as 58% even when P = 95%, i.e., even when 95% of the samples from the evolutionary process will support the true tree. Thus, if the rate-constancy assumption holds, bootstrapping is a conservative approach for estimating the reliability of an inferred phylogeny for four taxa.
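The resampling step of the bootstrap procedure described in this abstract can be sketched in Python for four aligned sequences (three taxa plus an outgroup). This is a minimal illustration under simplifying assumptions, not the authors' procedure: the function names `site_pattern`, `mp_tree`, and `bootstrap_support` are hypothetical, the maximum-parsimony tree is taken to be the topology supported by the largest number of informative sites, and ties are counted as supporting no tree.

```python
import random
from collections import Counter


def site_pattern(column):
    """Return the unrooted four-taxon tree supported by one alignment column.

    Only columns with two states, each shared by exactly two sequences,
    are parsimony-informative for four taxa; other columns return None.
    """
    a, b, c, d = column
    if a == b and c == d and a != c:
        return "AB|CD"
    if a == c and b == d and a != b:
        return "AC|BD"
    if a == d and b == c and a != b:
        return "AD|BC"
    return None


def mp_tree(columns):
    """Pick the topology with the most informative sites (None on a tie)."""
    counts = Counter(p for p in map(site_pattern, columns) if p is not None)
    ranked = counts.most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def bootstrap_support(seqs, n_reps=100, seed=0):
    """Resample alignment columns with replacement and tally the MP tree."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    columns = list(zip(*seqs))
    wins = Counter()
    for _ in range(n_reps):
        resampled = [rng.choice(columns) for _ in columns]
        wins[mp_tree(resampled)] += 1
    return {tree: n / n_reps for tree, n in wins.items() if tree is not None}
```

A tree would then be accepted under the 95% rule discussed in the abstract only if its entry in the returned dictionary is at least 0.95.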

2.
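The path topological distance between topological trees   Cited by: 1 (self-citations: 1, citations by others: 0)
A new topological distance between phylogenetic trees is presented. Phylogenetic trees were constructed with three methods (NJ, MP, and UPGMA) from DNA sequence data for 14 mitochondrial genes (including combined gene sets) of 13 animal species, and the partition topological distance and the path topological distance proposed here were used to compare these 14 trees with one another and with the true tree. The results show that NJ was the most efficient method for recovering the known tree, followed by MP, with UPGMA the least efficient. The topological distance between the known tree and the trees built from the 14 DNA sequence sets generally decreased as sequence length increased, but the correlation between the two was not significant. The partition topological distance reflects overall differences in topology between trees, but its measurement precision is lower than that of the path topological distance.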
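The partition topological distance used in this abstract as the baseline measure can be sketched as follows. The sketch assumes the usual definition (the number of bipartitions found in one tree but not the other) and a hypothetical tree encoding as nested tuples of taxon names; the path topological distance proposed in the paper is not defined in the abstract and is therefore not reproduced here.

```python
def leaf_set(tree):
    """Collect the taxon names below a node (a leaf is a plain string)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Return the non-trivial bipartitions induced by the interior branches.

    Each bipartition is stored as the side that does not contain a fixed
    reference taxon, so the same split always has the same representation.
    """
    taxa = leaf_set(tree)
    reference = min(taxa)
    found = set()

    def walk(node):
        below = leaf_set(node)
        if not isinstance(node, str):
            for child in node:
                walk(child)
        other = taxa - below
        if len(below) > 1 and len(other) > 1:
            found.add(other if reference in below else below)

    walk(tree)
    return found


def partition_distance(tree1, tree2):
    """Count the bipartitions present in exactly one of the two trees."""
    return len(splits(tree1) ^ splits(tree2))
```

For two five-taxon trees that share no interior branch, for example `(("A", "B"), ("C", "D"), "E")` and `(("A", "C"), ("B", "D"), "E")`, the distance is 4, the maximum for five taxa.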

3.
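Molecular phylogenetic relationships of the genus Lutjanus inferred from cytochrome b gene sequences   Cited by: 3 (self-citations: 0, citations by others: 3)
Partial cytochrome b gene sequences were determined for nine Lutjanus species from the South China Sea and combined with the corresponding homologous GenBank sequences of one Lutjanus species distributed in the Philippines and nine distributed in the U.S. Atlantic. Molecular phylogenetic trees were constructed with the neighbor-joining and maximum-parsimony methods. The results show that the percentage base difference between the homologous sequences of Lutjanus erythropterus and L. sanguineus is only 0.32%, which supports the view that the two names are synonyms for the same species. The mean base difference among the South China Sea Lutjanus species is higher than that among the U.S. Atlantic species. In both the MP and NJ trees, the U.S. Atlantic snappers appear closely related, whereas the snappers from the South China Sea are concentrated near the base of the tree and are more divergent. This is broadly consistent with the geographic distribution and isolation of the snappers studied and also indicates that the South China Sea Lutjanus differentiated earlier and diverged more.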

4.
The relative efficiencies of the maximum-parsimony (MP), UPGMA, and neighbor-joining (NJ) methods in obtaining the correct tree (topology) for restriction-site and restriction-fragment data were studied by computer simulation. In this simulation, six DNA sequences of 16,000 nucleotides were assumed to evolve following a given model tree. The recognition sequences of 20 different six-base restriction enzymes were used to identify the restriction sites of the DNA sequences generated. The restriction-site data and restriction-fragment data thus obtained were used to reconstruct a phylogenetic tree, and the tree obtained was compared with the model tree. This process was repeated 300 times. The results obtained indicate that when the rate of nucleotide substitution is constant the probability of obtaining the correct tree (Pc) is generally higher in the NJ method than in the MP method. However, if we use the average topological deviation from the model tree (dT) as the criterion of comparison, the NJ and MP methods are nearly equally efficient. When the rate of nucleotide substitution varies with evolutionary lineage, the NJ method is better than the MP method, whether Pc or dT is used as the criterion of comparison. With 500 nucleotides and when the number of nucleotide substitutions per site was very small, restriction-site data were, contrary to our expectation, more useful than sequence data. Restriction-fragment data were less useful than restriction-site data, except when the sequence divergence was very small. UPGMA seems to be useful only when the rate of nucleotide substitution is constant and sequence divergence is high.

5.
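Using the mitochondrial cytochrome oxidase I (COI) gene as a molecular marker, sequences were determined for butterflies of the subfamily Limenitidinae. Sequence analysis showed that the aligned and edited sequences were 645 bp long, with 199 variable sites and 147 parsimony-informative sites; the encoded amino acid sequences contained 18 variable sites and 7 informative sites. The mean A+T content was 69.6% and the mean G+C content 30.4%, so the base composition is AT-biased. With species of the subfamilies Nymphalinae and Pseudergolinae as outgroups, the phylogeny of the subfamily was reconstructed with the NJ, MP, and Bayesian methods, and the phylogenetic relationships among its main groups were examined. The molecular trees show that Limenitidinae consists of three major lineages: Neptini + Euthaliini, Limenitidini, and Parthenini. Neptini is monophyletic (the NJ tree also supports the monophyly of Limenitidini), Euthaliini is closely related to Neptini, and Parthenini may be a lineage that diverged early within the subfamily.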
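The alignment statistics reported in this abstract (variable sites, parsimony-informative sites, A+T content) can be computed along the following lines. This is a minimal sketch with hypothetical function names; it assumes an ungapped alignment of equal-length sequences and uses the common definition of a parsimony-informative site as one with at least two states that each occur in at least two sequences.

```python
from collections import Counter


def site_summary(seqs):
    """Return (variable sites, parsimony-informative sites) for an alignment."""
    variable = informative = 0
    for column in zip(*seqs):
        counts = Counter(column)
        if len(counts) > 1:
            variable += 1
            # Informative: at least two states, each seen at least twice.
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative += 1
    return variable, informative


def at_content(seqs):
    """Return the fraction of A and T among all bases in the alignment."""
    bases = "".join(seqs).upper()
    return (bases.count("A") + bases.count("T")) / len(bases)
```

By construction every informative site is also a variable site, which is consistent with the reported 147 informative sites among 199 variable sites.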

6.
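Phylogenetic relationships between the genus Nomocharis and five Lilium species based on ITS sequences   Cited by: 2 (self-citations: 0, citations by others: 2)
The ITS sequences of two Nomocharis species (滇蜀豹子花 and 多斑豹子花) were determined by direct sequencing of PCR products and combined with the GenBank ITS sequences of three other Nomocharis species and five Lilium species to construct a phylogenetic tree of these 10 plant species. The results show that: (1) the ITS sequences of the 10 species are 625 to 627 bp long, with a total G+C content of 60.38% to 61.12%; the G+C content of the 5.8S region is 45.4% in the Dali lily (大理百合) and 55.01% or 54.60% in the other nine species, indicating that the ITS sequence is evolutionarily conserved, with little length variation among species of the same genus or even among genera; (2) the NJ, MP, and ME trees show the same branching pattern, in which the Nomocharis species cluster together before joining the five Lilium species, and two species (滇西豹子花 and 豹子花) form a clade with more than 99% support in all three trees, indicating that these two species are the most closely related; (3) among the 10 species, those that are morphologically similar and overlap in altitude and geographic range cluster first, indicating that these species are closely related.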

7.
Murphy and colleagues reported that the mammalian phylogeny was resolved by Bayesian phylogenetics. However, the DNA sequences they used had many alignment gaps and undetermined nucleotide sites. We therefore reanalyzed their data by minimizing unshared nucleotide sites and retaining as many species as possible (13 species). In constructing phylogenetic trees, we used the Bayesian, maximum likelihood (ML), maximum parsimony (MP), and neighbor-joining (NJ) methods with different substitution models. These trees were constructed by using both protein and DNA sequences. The results showed that the posterior probabilities for Bayesian trees were generally much higher than the bootstrap values for ML, MP, and NJ trees. Two different Bayesian topologies for the same set of species were sometimes supported by high posterior probabilities, implying that two different topologies can be judged to be correct by Bayesian phylogenetics. This suggests that the posterior probability in Bayesian analysis can be excessively high as an indication of statistical confidence and therefore Murphy et al.'s tree, which largely depends on Bayesian posterior probability, may not be correct.

8.
The chloroplast-encoded atpB gene was sequenced from 33 strains representing 28 species of the colonial Volvocales (the Volvocaceae and its relatives) to reexamine phylogenetic relationships as previously deduced from morphological data and rbcL gene sequence data. A total of 1128 base pairs in the coding regions of the atpB gene were analyzed by MP, NJ, and ML analyses. Although supported with relatively low bootstrap values (75% and 65% in the NJ and ML analyses, respectively), three anisogamous/oogamous volvocacean genera (Eudorina, Pleodorina, and Volvox, excluding the section Volvox [= Euvolvox, illegitimate name]) constituted a large monophyletic group (Eudorina group). Outside the Eudorina group, a robust lineage composed of three species of Volvox sect. Volvox was resolved as in the rbcL gene trees, rejecting the hypothesis of the previous cladistic analysis based on morphological data that the genus Volvox is monophyletic. In addition, the NJ and ML trees suggested that Eudorina is a nonmonophyletic genus as inferred from the morphological data and rbcL gene sequences. Although the phylogenetic status of the genus Gonium is ambiguous in the rbcL gene trees and the paraphyly of this genus is resolved in the cladistic analysis based on morphological data, the atpB gene sequence data suggest monophyly of Gonium with relatively low bootstrap values (56–61%) in the NJ and ML trees. On the basis of the combined sequence data (2256 base pairs) from the atpB and rbcL genes, Gonium was resolved as a robust monophyletic genus in the NJ and ML trees (with 68–86% bootstrap values), and Eudorina elegans Ehrenberg represented a paraphyletic species positioned most basally within the Eudorina group. However, the phylogenetic status and relationships of the families of the colonial Volvocales remained largely ambiguous even in the combined analysis.

9.
Lake's evolutionary parsimony (EP) method of constructing a phylogenetic tree is primarily applied to four DNA sequences. In this method, three quantities (X, Y, and Z) that correspond to three possible unrooted trees are computed, and an invariance property of these quantities is used for choosing the best tree. However, Lake's method depends on a number of unrealistic assumptions. We therefore examined the theoretical basis of his method and reached the following conclusions: (1) When the rates of two transversional changes from a nucleotide are unequal, his invariance property breaks down. (2) Even if the rates of two transversional changes are equal, the invariance property requires some additional conditions. (3) When Kimura's two-parameter model of nucleotide substitution applies and the rate of nucleotide substitution varies greatly with branch, the EP method is generally better than the standard maximum-parsimony (MP) method in recovering the correct tree but is inferior to the neighbor-joining (NJ) and a few other distance matrix methods. (4) When the rate of nucleotide substitution is the same or nearly the same for all branches, the EP method is inferior to the MP method even if the proportion of transitional changes is high. (5) When Lake's assumptions fail, his χ² test may identify an erroneous tree as the correct tree. This happens because the test is not for comparing different trees. (6) As long as a proper distance measure is used, the NJ method is better than the EP and MP methods whether there is a transition/transversion bias or whether there is variation in substitution rate among different nucleotide sites.

10.
Tie trees generated by distance methods of phylogenetic reconstruction   Cited by: 2 (self-citations: 0, citations by others: 2)
In examining genetic data in recent publications, Backeljau et al. showed cases in which two or more different trees (tie trees) were constructed from a single data set for the neighbor-joining (NJ) method and the unweighted pair group method with arithmetic mean (UPGMA). However, it is still unclear how often and under what conditions tie trees are generated. Therefore, I examined these problems by computer simulation. Examination of cases in which tie trees occur shows that tie trees can appear when no substitutions occur along some interior branch(es) on a tree. However, even when some substitutions occur along interior branches, tie trees can appear by chance if parallel or backward substitutions occur at some sites. The simulation results showed that tie trees occur relatively frequently for sequences with low divergence levels or with small numbers of sites. For such data, UPGMA sometimes produced tie trees quite frequently, whereas tie trees for the NJ method were generally rare. In the simulation, bootstrap values for clusters (tie clusters) that differed among tie trees were mostly low (< 60%). With a small probability, relatively high bootstrap values (at most 70%-80%) appeared for tie clusters. The bias of the bootstrap values caused by an input order of sequence can be avoided if one of the different paths in the cycles of making an NJ or UPGMA tree is chosen at random in each bootstrap replication.

11.
The relative efficiencies of different protein-coding genes of the mitochondrial genome and different tree-building methods in recovering a known vertebrate phylogeny (two whale species, cow, rat, mouse, opossum, chicken, frog, and three bony fish species) were evaluated. The tree-building methods examined were the neighbor joining (NJ), minimum evolution (ME), maximum parsimony (MP), and maximum likelihood (ML), and both nucleotide sequences and deduced amino acid sequences were analyzed. Generally speaking, amino acid sequences were better than nucleotide sequences in obtaining the true tree (topology) or trees close to the true tree. However, when only first and second codon positions data were used, nucleotide sequences produced reasonably good trees. Among the 13 genes examined, Nd5 produced the true tree in all tree-building methods or algorithms for both amino acid and nucleotide sequence data. Genes Cytb and Nd4 also produced the correct tree in most tree-building algorithms when amino acid sequence data were used. By contrast, Co2, Nd1, and Nd4l showed a poor performance. In general, large genes produced better results, and when the entire set of genes was used, all tree-building methods generated the true tree. In each tree-building method, several distance measures or algorithms were used, but all these distance measures or algorithms produced essentially the same results. The ME method, in which many different topologies are examined, was no better than the NJ method, which generates a single final tree. Similarly, an ML method, in which many topologies are examined, was no better than the ML star decomposition algorithm that generates a single final tree. In ML the best substitution model chosen by using the Akaike information criterion produced no better results than simpler substitution models. These results question the utility of the currently used optimization principles in phylogenetic construction. Relatively simple methods such as the NJ and ML star decomposition algorithms seem to produce as good results as those obtained by more sophisticated methods. The efficiencies of the NJ, ME, MP, and ML methods in obtaining the correct tree were nearly the same when amino acid sequence data were used. The most important factor in constructing reliable phylogenetic trees seems to be the number of amino acids or nucleotides used.

12.
A phylogenetic method is a consistent estimator of phylogeny if and only if it is guaranteed to give the correct tree, given that sufficient (possibly infinite) independent data are examined. The following methods are examined for consistency: UPGMA (unweighted pair-group method, averages), NJ (neighbor joining), MF (modified Farris), and P (parsimony). A two-parameter model of nucleotide sequence substitution is used, and the expected distribution of character states is calculated. Without perfect correction for superimposed substitutions, all four methods may be inconsistent if there is but one branch evolving at a faster rate than the other branches. Partial correction of observed distances improves the robustness of the NJ method to rate variation, and perfect correction makes the NJ method a consistent estimator for all combinations of rates that were examined. The sensitivity of all the methods to unequal rates varies over a wide range, so relative-rate tests are unlikely to be a reliable guide for accepting or rejecting phylogenies based on parsimony analysis.

13.
The relative efficiencies of the maximum parsimony (MP) and distance-matrix methods in obtaining the correct tree (topology) were studied by using computer simulation. The distance-matrix methods examined are the neighbor-joining, distance-Wagner, Tateno et al. modified Farris, Faith, and Li methods. In the computer simulation, six or eight DNA sequences were assumed to evolve following a given model tree, and the evolutionary changes of the sequences were followed. Both constant and varying rates of nucleotide substitution were considered. From the sequences thus obtained, phylogenetic trees were constructed using the six tree-making methods and compared with the model (true) tree. This process was repeated 300 times for each different set of parameters. The results obtained indicate that when the number of nucleotide substitutions per site is small and a relatively small number of nucleotides are used, the probability of obtaining the correct topology (P1) is generally lower in the MP method than in the distance-matrix methods. The P1 value for the MP method increases with increasing number of nucleotides but is still generally lower than the value for the NJ or DW method. Essentially the same conclusion was obtained whether or not the rate of nucleotide substitution was constant or whether or not a transition bias in nucleotide substitution existed. The relatively poor performance of the MP method for these cases is due to the fact that information from singular sites is not used in this method. The MP method also showed a relatively low P1 value when the model of varying rate of nucleotide substitution was used and the number of substitutions per site was large. However, the MP method often produced cases in which the correct tree was one of several equally parsimonious trees. When these cases were included in the class of "success," the MP method performed better than the other methods, provided that the number of nucleotide substitutions per site was small.  

14.
Wang JB, Wang C, Shi SH, Zhong Y. Hereditas, 2000, 132(3): 209-213
The nucleotide sequences of the internal transcribed spacer (ITS) of nuclear ribosomal DNA in nine diploid species representing six sections of Aegilops were determined by direct sequencing of PCR-amplified DNA fragments. These sequences were aligned with two ITS sequences of additional species from GenBank. Sequence divergences were estimated using the Kimura two-parameter model, and the phylogenetic analyses were performed using the maximum parsimony (MP) and the neighbor-joining (NJ) methods with PAUP and PHYLIP, respectively. The sequence divergences between the diploid species varied from 0.5% to 4.68%. The resulting MP tree and NJ tree showed relatively congruent phylogenetic relationships among these species, except Ae. caudata. Particularly, Ae. speltoides was basal within the two trees. The paraphyletic relationships between Ae. speltoides and two species of Sect. Sitopsis, and between Ae. uniaristata and two species of Sect. Comopyrum were supported strongly. The ITS data suggest that currently recognized sections within Aegilops should be reconsidered.
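The Kimura two-parameter divergence mentioned in this abstract is conventionally computed from the proportion of transitional differences P and transversional differences Q between two aligned sequences as d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)]. A minimal sketch follows; the function name `k2p_distance` is hypothetical, and the abstract does not say which software options the authors actually used.

```python
import math


def k2p_distance(p_transitions, q_transversions):
    """Kimura two-parameter distance from observed difference proportions.

    p_transitions   -- proportion of sites differing by a transition (P)
    q_transversions -- proportion of sites differing by a transversion (Q)
    """
    P, Q = p_transitions, q_transversions
    inner = (1 - 2 * P - Q) * math.sqrt(1 - 2 * Q)
    if inner <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(inner)
```

At the low divergences reported here (0.5% to 4.68%), the corrected distance is only slightly larger than the raw proportion of differing sites, because few multiple substitutions are expected at the same site.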

15.
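Wang Jiang, Fang Shengguo. Acta Theriologica Sinica (兽类学报), 2005, 25(2): 105-114
The taxonomic position of Procapra species within the subfamily Antilopinae remains much debated. We determined the complete cytochrome b gene sequences (1140 bp) of two Procapra species, the Mongolian gazelle and the Tibetan gazelle, compared them with homologous sequences from 31 species of other genera of Bovidae, and analyzed the variation in base composition and the nucleotide sequence differences. Phylogenetic trees were constructed from the complete cytochrome b sequences with the parsimony (MP), neighbor-joining (NJ), and likelihood (ML) methods. The results show that the sequence difference between the Mongolian gazelle and the Tibetan gazelle is 3.78%, with almost no transversions, so substitutions are far from saturated. Within Procapra, the Mongolian gazelle and the Tibetan gazelle are distinct species and form a monophyletic group. Procapra is paraphyletic with respect to Saiga, Madoqua, Antidorcas, and related genera; it belongs to Antilopinae and should be treated as a separate genus. Most of the genera that make up Antilopinae are of paraphyletic origin. Based on a cytochrome b molecular clock of 2% sequence difference per million years, the Mongolian gazelle and the Tibetan gazelle are estimated to have diverged about 1 to 2 million years ago, and Procapra diverged from the other genera of Antilopinae about 5.7 to 8 million years ago.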
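The molecular-clock estimate in this abstract is a simple division of sequence difference by the clock rate, which can be checked with a short sketch. It assumes, as the abstract's wording suggests, that the rate of 2% per million years refers to the pairwise sequence difference; the function name is hypothetical.

```python
def divergence_time_myr(percent_difference, rate_percent_per_myr=2.0):
    """Estimate divergence time (million years) under a strict molecular clock."""
    return percent_difference / rate_percent_per_myr


# The reported 3.78% difference between the two gazelles gives about 1.89
# million years, which falls inside the stated range of 1 to 2 million years.
gazelle_split = divergence_time_myr(3.78)
```

Read in the other direction, the stated divergence of 5.7 to 8 million years between Procapra and the other genera corresponds to sequence differences of 11.4% to 16% under the same rate.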

16.
In phylogenetic inference by maximum-parsimony (MP), minimum-evolution (ME), and maximum-likelihood (ML) methods, it is customary to conduct extensive heuristic searches of MP, ME, and ML trees, examining a large number of different topologies. However, these extensive searches tend to give incorrect tree topologies. Here we show by extensive computer simulation that when the number of nucleotide sequences (m) is large and the number of nucleotides used (n) is relatively small, the simple MP or ML tree search algorithms such as the stepwise addition (SA) plus nearest neighbor interchange (NNI) search and the SA plus subtree pruning regrafting (SPR) search are as efficient as the extensive search algorithms such as the SA plus tree bisection-reconnection (TBR) search in inferring the true tree. In the case of ME methods, the simple neighbor-joining (NJ) algorithm is as efficient as or more efficient than the extensive NJ+TBR search. We show that when ME methods are used, the simple p distance generally gives better results in phylogenetic inference than more complicated distance measures such as the Hasegawa-Kishino-Yano (HKY) distance, even when nucleotide substitution follows the HKY model. When ML methods are used, the simple Jukes-Cantor (JC) model of phylogenetic inference generally shows a better performance than the HKY model even if the likelihood value for the HKY model is much higher than that for the JC model. This indicates that at least in the present case, selecting a substitution model by using the likelihood ratio test or the AIC index is not appropriate. When n is small relative to m and the extent of sequence divergence is high, the NJ method with p distance often shows a better performance than ML methods with the JC model. However, when the level of sequence divergence is low, this is not the case.
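The two simplest distance measures contrasted in this abstract can be sketched as follows: the p distance is the observed proportion of differing sites, and the Jukes-Cantor distance corrects it for multiple substitutions as d = -3/4 ln(1 - 4p/3). The function names are hypothetical, and the sketch assumes two ungapped sequences of equal length.

```python
import math


def p_distance(seq1, seq2):
    """Proportion of aligned sites at which the two sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(a != b for a, b in zip(seq1, seq2))
    return differences / len(seq1)


def jc_distance(p):
    """Jukes-Cantor corrected distance for an observed p distance."""
    if p >= 0.75:
        raise ValueError("p distance too large for the JC correction")
    return -0.75 * math.log(1 - 4 * p / 3)
```

The logarithmic correction inflates the sampling variance of the estimate, which is one commonly given reason why the uncorrected p distance can perform well when few sites are available, as reported above.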

17.
The neighbor-joining (NJ) method is widely used in reconstructing large phylogenies because of its computational speed and the high accuracy in phylogenetic inference as revealed in computer simulation studies. However, most computer simulation studies have quantified the overall performance of the NJ method in terms of the percentage of branches inferred correctly or the percentage of replications in which the correct tree is recovered. We have examined other aspects of its performance, such as the relative efficiency in correctly reconstructing shallow (close to the external branches of the tree) and deep branches in large phylogenies; the contribution of zero-length branches to topological errors in the inferred trees; and the influence of increasing the tree size (number of sequences), evolutionary rate, and sequence length on the efficiency of the NJ method. Results show that the correct reconstruction of deep branches is no more difficult than that of shallower branches. The presence of zero-length branches in realized trees contributes significantly to the overall error observed in the NJ tree, especially in large phylogenies or slowly evolving genes. Furthermore, the tree size does not influence the efficiency of NJ in reconstructing shallow and deep branches in our simulation study, in which the evolutionary process is assumed to be homogeneous in all lineages. Received: 7 March 2000 / Accepted: 2 August 2000

18.
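To investigate the phylogenetic relationships and molecular evolutionary characteristics of silk-producing insects in Lepidoptera, partial sequences of the mitochondrial 12S rRNA gene were determined for wild and domesticated types of the Chinese oak silkworm Antheraea pernyi and analyzed together with 17 sequences from GenBank, covering the 12S rRNA gene of nine silk-producing species in total (two families, three genera). MEGA 3.1 was used to compile base composition and variable sites and to carry out the molecular evolutionary analysis, and phylogenetic trees were reconstructed with the unweighted pair group method with arithmetic mean (UPGMA), neighbor-joining (NJ), minimum-evolution (ME), and maximum-parsimony (MP) methods. The 12S rRNA sequence (427 bp) determined for the wild type of the Chinese oak silkworm was identical to that of the domesticated strain "Yuzao No. 1". After alignment, 80 variable sites and 50 parsimony-informative sites were identified. Base composition differed clearly among families and genera: the AT content was higher in Bombycidae than in Saturniidae, and Saturniidae preferentially used T whereas Bombycidae preferentially used A. Unlike the transition-dominated substitution pattern common in animals, transversions outnumbered transitions between all species among the nine insects analyzed, except within the mulberry silkworm genus (Bombyx), where transitions and transversions were roughly equal. The evolutionary analysis supports the monophyly of the oak silkworm genus (Antheraea), the ailanthus silkworm genus (樗蚕属), and Bombyx. The UPGMA tree suggests that the muga silkworm (琥珀蚕) is the more primitive form within Antheraea, whereas the NJ, ME, and MP trees suggest that the Indian tasar silkworm (印度柞蚕) is the more primitive form; the evolutionary relationships among Antheraea species therefore require further study.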

19.
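The exon 5 sequence of the GH gene was analyzed in 16 Leiqiong cattle; one variable site was found and two haplotypes were defined. Using the homologous GH gene sequences of two Bazhou yak individuals together with GenBank sequences of the homologous GH region from three groups within the genus Bos (taurine cattle, other zebu, and yak) and one distantly related species (water buffalo), molecular phylogenetic trees were constructed with the neighbor-joining (NJ) and maximum-parsimony (MP) methods, which yielded essentially the same topology. The results indicate that the divergence of the GH gene predates the divergence of Leiqiong cattle (zebu), other zebu, taurine cattle, yak, and water buffalo, and that polymorphism exists within zebu; they also confirm that the exon 5 region of the GH gene has a relatively high mutation rate.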

20.
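Taxonomic status of the Chong'an ground lizard (Platyplacopus sylvaticus) inferred from 12S rRNA gene sequences   Cited by: 1 (self-citations: 1, citations by others: 0)
To investigate the taxonomic status of the Chong'an ground lizard Platyplacopus sylvaticus, its complete mitochondrial 12S rRNA gene sequence was determined and analyzed together with homologous GenBank sequences from 10 grass lizard species and 3 ground lizard species from East Asia. Molecular phylogenetic trees were constructed with the NJ and ME methods in MEGA version 2.1 and with the MP method in PAUP 4.0. The results show that the complete 12S rRNA gene sequence of P. sylvaticus (952 bp) contains 23.1% T, 22.9% C, 35.9% A, and 18.1% G. The alignment with the other homologous sequences is 978 bp long and contains 321 variable sites (32.8% of all sites), of which 199 are parsimony-informative (62% of the variable sites); the mean transition/transversion ratio is 2.16. Among the trees obtained, the NJ and ME trees are identical and differ slightly from the MP tree. With all three methods, P. sylvaticus clusters with Takydromus sauteri, P. intermedius, and P. dorsalis, and it is most closely related to P. dorsalis. These results support merging the genus Platyplacopus into Takydromus and abandoning the subgenus Platyplacopus.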
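The summary statistics in this abstract (base composition, transition and transversion counts) can be obtained along the following lines. This is a minimal sketch with hypothetical function names, not the MEGA or PAUP implementation; it skips alignment positions that are not one of A, C, G, or T.

```python
from collections import Counter

PURINES = frozenset("AG")


def base_composition(seq):
    """Return the fraction of T, C, A, and G among the unambiguous bases."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    return {base: counts[base] / total for base in "TCAG"}


def ts_tv_counts(seq1, seq2):
    """Count transitions and transversions between two aligned sequences."""
    transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        # A<->G and C<->T stay within one chemical class (transition);
        # any purine<->pyrimidine change is a transversion.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions
```

The percentages quoted in the abstract are internally consistent: 321 of 978 sites is 32.8%, and 199 of 321 variable sites is 62%.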
