Similar literature
 20 similar articles found (search time: 218 ms)
1.
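The composition vector tree (CVTree) method infers the relatedness of species from whole genomes without sequence alignment. CVTree3 is our newly developed CVTree web server. It is built on a parallelized core program to cope with the massive growth of genome data, and it automatically compares the inferred relationships with the taxonomic system and displays the comparison interactively on a web page, which makes the analysis more intuitive. With the CVTree3 web server, users can quickly analyze the relationships of unknown whole-genome sequences and make a preliminary identification of their taxonomic position. Because it makes sound use of whole-genome information, the CVTree method has high resolving power for relationships and classification below the species level. As the method is further developed and refined, we hope it will become a definitive tool for elucidating the relationships and taxonomy of prokaryotes.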

2.
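Microbial genome and RNA sequences make up an important part of biological data. CVTree, which starts from genomes and uses no sequence alignment, and LVTree, which is based on alignment of 16S rRNA sequences, are two routes for constructing prokaryotic phylogenetic trees and taxonomic systems whose raw data and computational procedures are mutually independent. Automating these two routes makes phylogeny and taxonomy a by-product of big-data analysis and can help taxonomy, a field short of successors, out of its predicament. In particular, the genome-based CVTree both provides a tool for large-scale studies and offers, below the species level, a resolving power that 16S rRNA sequence analysis cannot match; it can raise and solve a range of new problems and open up several new directions. This paper is a brief review of the related research.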

3.
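Birds are one of the major groups of vertebrates, and their internal evolutionary relationships have long been a focus of zoological research. Morphological and molecular characters are both widely used to infer phylogenetic relationships among avian groups, but, owing to homoplasy, different morphological characters have long failed to yield consistent phylogenies. With the rapid development of genome sequencing technology and analytical methods, species trees built from large amounts of DNA sequence data have become the generally accepted avian phylogeny, which makes it possible to re-examine the research value of morphological characters. Based on the family-level avian phylogenetic tree published by Prum in 2015, this study compared the levels of homoplasy of different morphological characters and discussed the role of morphological characters in improving the phylogenetic signal of molecular characters and the support values of the molecular tree. The levels of homoplasy differed significantly among kinds of morphological characters. Using the consistency index as the criterion, miscellaneous characters (including nerve, tendon and intestinal characters) scored significantly higher than osteological and myological characters (P < 0.01), and cranial characters scored significantly higher than non-cranial, trunk and leg characters (P < 0.05). Adding morphological characters significantly improved the phylogenetic signal of the molecular characters (P < 0.05) and also raised the support values of the molecular tree. In summary, different morphological characters differ significantly in their homoplasy scores, and adding morphological characters to molecular characters improves both the phylogenetic signal of the molecular characters and the support values of the molecular tree.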

4.
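The trnL-F and ITS sequences of 41 species of Aceraceae (39 of them in Acer) are reported (the ITS sequences of some species were redetermined), with the aim of reconstructing by molecular means the phylogenetic relationships within Aceraceae, especially within the complex genus Acer. With Sapindaceae and Hippocastanaceae as outgroups, the phylogeny of Aceraceae was analyzed with the maximum parsimony method and the neighbor-joining method, using ITS sequences alone for 57 species (including sequences of 16 species downloaded from GenBank), trnL-F sequences for 41 species, and the combined data of both sequences for 41 species. The results show that Aceraceae as a whole is a monophyletic group and that Dipteronia sinensis lies at the base of the family; however, because Dipteronia dyerana clustered within Acer, both Dipteronia and Acer may be non-monophyletic. Support for relationships among the sections of Acer was generally low, but relationships within most sections were well supported. Combining the two fragments resolved the relationships within Acer better than ITS or trnL-F alone. The inter-sectional relationships of sect. Palmata and sect. Microcarpa; of sect. Platanoidea, sect. Lithocarpa and sect. Macrophylla; of sect. Integrifolia, sect. Trifoliata and sect. Pentaphylla; and of sect. Acer, sect. Goniocarpa and sect. Saccharina (sensu Ogata) received some support, although the delimitation of some of these sections may need further adjustment. The treatment of sect. Rubra and sect. Saccharodendron in Xu Tingzhi's system was re-evaluated.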

5.
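Yu Li, Zhang Yaping. 《动物学研究》 (Zoological Research) 2006, 27(6): 657-665
Tracing the origins and evolutionary relationships of the different forms of life, that is, reconstructing the phylogenetic trees of biological groups, is a very important part of evolutionary biology. Carnivoran mammals sit at the top of the food chain; many of them not only occupy an important place in wildlife conservation in China but are also important model organisms for studying the genetic mechanisms of adaptive evolution. As an important group among species resources, the Carnivora have therefore long been a popular subject of phylogenetic research in China and abroad. Constructing a reliable molecular phylogenetic tree of Carnivora will undoubtedly be of great significance for evolutionary theory and of value for conservation biology. Given that the phylogenetic relationships among the families of Carnivora are still "widely debated", this paper briefly reviews progress in family-level phylogenetic studies of Carnivora, including evidence from morphology, cytology and molecular biology, and points out problems in current research. This will guide further work on carnivoran phylogeny and lay a foundation for adaptive-evolution studies that use this group as a model.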

6.
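Phylogeny of Aceraceae based on ITS and trnL-F sequences   Total citations: 10, self-citations: 0, citations by others: 10
The trnL-F and ITS sequences of 41 species of Aceraceae (39 of them in Acer) are reported (the ITS sequences of some species were redetermined), with the aim of reconstructing by molecular means the phylogenetic relationships within Aceraceae, especially within the complex genus Acer. With Sapindaceae and Hippocastanaceae as outgroups, the phylogeny of Aceraceae was analyzed with the maximum parsimony method and the neighbor-joining method, using ITS sequences alone for 57 species (including sequences of 16 species downloaded from GenBank), trnL-F sequences for 41 species, and the combined data of both sequences for 41 species. The results show that Aceraceae as a whole is a monophyletic group and that Dipteronia sinensis lies at the base of the family; however, because Dipteronia dyerana clustered within Acer, both Dipteronia and Acer may be non-monophyletic. Support for relationships among the sections of Acer was generally low, but relationships within most sections were well supported. Combining the two fragments resolved the relationships within Acer better than ITS or trnL-F alone. The inter-sectional relationships of sect. Palmata and sect. Microcarpa; of sect. Platanoidea, sect. Lithocarpa and sect. Macrophylla; of sect. Integrifolia, sect. Trifoliata and sect. Pentaphylla; and of sect. Acer, sect. Goniocarpa and sect. Saccharina (sensu Ogata) received some support, although the delimitation of some of these sections may need further adjustment. The treatment of sect. Rubra and sect. Saccharodendron in Xu Tingzhi's system was re-evaluated.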

8.
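Mammals are the most highly evolved group of animals and dominate the Earth, and reconstructing their phylogenetic relationships has always been a central topic in molecular systematics. As whole-genome sequencing is completed for more and more species, studying the phylogeny and evolution of mammals at the genome level has become a research focus. This paper briefly introduces current applications of phylogenomics in the molecular systematics of extant mammals from several angles, including whole-genome sequences, rare genomic changes and chromosome painting. It synthesizes existing studies to summarize the phylogenetic relationships among the superorders and orders of placental mammals and presents a phylogenetic tree of 19 placental orders. It also analyzes the main problems that mammalian phylogenomics currently faces and its future prospects.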

9.
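In recent years a great deal of work has been done on the phylogenetic relationships of Brassicaceae species. Studies have found that Brassicaceae can be divided into three major lineages, but the evolutionary relationships within and among these lineages remain unclear. To resolve the phylogenetic relationships of Brassicaceae quickly and accurately, 39 Brassicaceae species and two outgroup species were selected as study material, and a set of low-copy homologous genes covering the selected species was obtained with phylogenomic methods. The compositional features of these low-copy nuclear genes were then analyzed with the CVTree method, yielding a highly supported and stable Brassicaceae phylogeny. The results show that Brassicaceae is divided into six major groups; the delimitation of three of them is highly consistent with previous classifications, and two new groups are added. In addition, the second lineage, which was controversial in previous studies, is a stably supported monophyletic group in this study. These results indicate that combining a large set of low-copy homologous genes with composition vector analysis can reflect the phylogenetic relationships of Brassicaceae species fairly accurately. The CVTree method is therefore suitable not only for studying the phylogeny of prokaryotes, fungi and other microorganisms but also for exploring the relationships of higher organisms such as Brassicaceae plants.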

10.
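Root-lesion nematodes (Pratylenchus), also known as root rot nematodes, are among the most widely distributed and most destructive migratory endoparasitic plant nematodes in the world. In this study, phylogenetic trees of Pratylenchus were constructed with the neighbor-joining (NJ) method in MEGA 4.0 from 214 ribosomal ITS sequences and 218 sequences of the D2-D3 region of the ribosomal 28S large subunit. The two trees were broadly similar overall and differed only in minor branches. The ITS-based tree divided 25 Pratylenchus species into at least 8 groups, and the D2-D3-based tree divided 23 species into at least 7 groups. The phylogenetic relationships within three of the large groups were fairly clear. The phylogenetic analysis in this study still cannot determine the overall interspecific relationships of Pratylenchus.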

11.
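An alignment-free analysis based on the K-tuple distribution of DNA sequences   Total citations: 1, self-citations: 0, citations by others: 1
Shen Juan, Wu Wenwu, Xie Xiaoli, Guo Mancai, Yuan Zhifa. 《遗传》 (Hereditas) 2010, 32(6): 606-612
Based on the K-tuple distribution of genomes, this paper presents an alignment-free method for estimating the degree of difference between biological sequences. The method can be used to measure how far the K-tuple distribution of a real DNA sequence departs from that of a randomly shuffled sequence. When the method was used to build a phylogenetic tree from the complete mitochondrial genomes of 26 placental mammals, the classification given by the tree agreed increasingly well with the biologically accepted result as K increased. The results indicate that phylogenetic trees built with this method are more reasonable than those built with other alignment-free methods.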
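The general idea of comparing sequences through their K-tuple distributions can be sketched as follows. This is a generic illustration, not the measure defined in the paper: the abstract does not give the formula, so the Euclidean distance and the function names `ktuple_freqs` and `ktuple_distance` are assumptions made for the example.

```python
from collections import Counter
from itertools import product
from math import sqrt


def ktuple_freqs(seq, k):
    """Relative frequency of every overlapping K-tuple over the alphabet ACGT.

    K-tuples containing other symbols (for example N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[w] for w in words)
    if total == 0:
        raise ValueError("sequence contains no valid K-tuple of this length")
    return [counts[w] / total for w in words]


def ktuple_distance(a, b, k):
    """Euclidean distance between two K-tuple frequency vectors.

    Assumption: Euclidean distance is used only for illustration.
    """
    fa, fb = ktuple_freqs(a, k), ktuple_freqs(b, k)
    return sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)))
```

A matrix of such pairwise distances could then be passed to a distance-based tree-building method such as neighbor-joining.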

12.
The process of inferring phylogenetic trees from molecular sequences almost always starts with a multiple alignment of these sequences but can also be based on methods that do not involve multiple sequence alignment. Very little is known about the accuracy with which such alignment-free methods recover the correct phylogeny or about the potential for increasing their accuracy. We conducted a large-scale comparison of ten alignment-free methods, among them one new approach that does not calculate distances and a faster variant of our pattern-based approach; all distance-based alignment-free methods are freely available from http://www.bioinformatics.org.au (as Python package decaf+py). We show that most methods exhibit a higher overall reconstruction accuracy in the presence of high among-site rate variation. Under all conditions that we considered, variants of the pattern-based approach were significantly better than the other alignment-free methods. The new pattern-based variant achieved a speed-up of an order of magnitude in the distance calculation step, accompanied by a small loss of tree reconstruction accuracy. A method of Bayesian inference from k-mers did not improve on classical alignment-free (and distance-based) methods but may still offer other advantages due to its Bayesian nature. We found the optimal word length k of word-based methods to be stable across various data sets, and we provide parameter ranges for two different alphabets. The influence of these alphabets was analyzed to reveal a trade-off in reconstruction accuracy between long and short branches. We have mapped the phylogenetic accuracy for many alignment-free methods, among them several recently introduced ones, and increased our understanding of their behavior in response to biologically important parameters. In all experiments, the pattern-based approach emerged as superior, at the expense of higher resource consumption. 
Nonetheless, no alignment-free method that we examined recovers the correct phylogeny as accurately as does an approach based on maximum-likelihood distance estimates of multiply aligned sequences.

13.
Digital signal processing (DSP) techniques for biological sequence analysis continue to grow in popularity due to the inherent digital nature of these sequences. DSP methods have demonstrated early success for detection of coding regions in a gene. Recently, these methods are being used to establish DNA gene similarity. We present the inter-coefficient difference (ICD) transformation, a novel extension of the discrete Fourier transformation, which can be applied to any DNA sequence. The ICD method is a mathematical, alignment-free DNA comparison method that generates a genetic signature for any DNA sequence that is used to generate relative measures of similarity among DNA sequences. We demonstrate our method on a set of insulin genes obtained from an evolutionarily wide range of species, and on a set of avian influenza viral sequences, which represents a set of highly similar sequences. We compare phylogenetic trees generated using our technique against trees generated using traditional alignment techniques for similarity and demonstrate that the ICD method produces a highly accurate tree without requiring an alignment prior to establishing sequence similarity.

14.
Molecular sequences provide a rich source of data for inferring the phylogenetic relationships among species. However, recent work indicates that even an accurate multiple alignment of a large sequence set may yield an incorrect phylogeny and that the quality of the phylogenetic tree improves when the input consists only of the highly conserved, motif regions of the alignment. This work introduces two methods of producing multiple alignments that include only the conserved regions of the initial alignment. The first method retains conserved motifs, whereas the second retains individual conserved sites in the initial alignment. Using parsimony analysis on a mitochondrial data set containing 19 species among which the phylogenetic relationships are widely accepted, both conserved alignment methods produce better phylogenetic trees than the complete alignment. Unlike any of the 19 inference methods used before to analyze this data, both methods produce trees that are completely consistent with the known phylogeny. The motif-based method employs far fewer alignment sites for comparable error rates. For a larger data set containing mitochondrial sequences from 39 species, the site-based method produces a phylogenetic tree that is largely consistent with known phylogenetic relationships and suggests several novel placements. J. Exp. Zool. ( Mol. Dev. Evol.) 285:128-139, 1999.

15.
Little DP 《PloS one》2011,6(8):e20552
For DNA barcoding to succeed as a scientific endeavor an accurate and expeditious query sequence identification method is needed. Although a global multiple-sequence alignment can be generated for some barcoding markers (e.g. COI, rbcL), not all barcoding markers are as structurally conserved (e.g. matK). Thus, algorithms that depend on global multiple-sequence alignments are not universally applicable. Some sequence identification methods that use local pairwise alignments (e.g. BLAST) are unable to accurately differentiate between highly similar sequences and are not designed to cope with hierarchic phylogenetic relationships or within taxon variability. Here, I present a novel alignment-free sequence identification algorithm--BRONX--that accounts for observed within taxon variability and hierarchic relationships among taxa. BRONX identifies short variable segments and corresponding invariant flanking regions in reference sequences. These flanking regions are used to score variable regions in the query sequence without the production of a global multiple-sequence alignment. By incorporating observed within taxon variability into the scoring procedure, misidentifications arising from shared alleles/haplotypes are minimized. An explicit treatment of more inclusive terminals allows for separate identifications to be made for each taxonomic level and/or for user-defined terminals. BRONX performs better than all other methods when there is imperfect overlap between query and reference sequences (e.g. mini-barcode queries against a full-length barcode database). BRONX consistently produced better identifications at the genus-level for all query types.

16.

Background  

The vast sequence divergence among different virus groups has presented a great challenge to alignment-based analysis of virus phylogeny. Due to the problems caused by the uncertainty in alignment, existing tools for phylogenetic analysis based on multiple alignment could not be directly applied to the whole-genome comparison and phylogenomic studies of viruses. There has been a growing interest in alignment-free methods for phylogenetic analysis using complete genome data. Among the alignment-free methods, a dynamical language (DL) method proposed by our group has successfully been applied to the phylogenetic analysis of bacteria and chloroplast genomes.

17.
18.
Traditional phylogenetic analysis is based on multiple sequence alignment. With the progress of genome sequencing projects worldwide, more and more completely sequenced genomes are becoming available. However, traditional sequence alignment tools cannot handle large-scale genome sequences. The development of new algorithms that infer phylogenetic relationships from whole-genome information without alignment therefore represents a new direction for phylogenetic study in the post-genome era. In the present study, a novel algorithm based on base-base correlation (BBC) is proposed to analyze the phylogenetic relationships of hepatitis E virus (HEV). When 48 HEV genome sequences are analyzed, the phylogenetic tree constructed with the BBC algorithm is consistent with that of a previous study. Compared with sequence alignment methods, the BBC algorithm calculates evolutionary distances between whole-genome sequences more rapidly and requires no human intervention such as gene identification or parameter selection. The BBC algorithm can serve as an alternative for rapidly constructing phylogenetic trees and inferring evolutionary relationships.
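A base-base correlation feature can be sketched as below. The abstract does not state the formula, so this assumes one plausible formulation: for each ordered base pair (i, j), the mutual-information-like terms P_ij(l) log2(P_ij(l) / (P_i P_j)) are summed over gaps l = 1..K, which gives a 16-dimensional vector per genome. The names `bbc_vector`, `bbc_distance` and `max_gap`, and the Euclidean distance, are hypothetical choices for illustration.

```python
from math import log2, sqrt

BASES = "ACGT"


def bbc_vector(seq, max_gap):
    """16-dimensional correlation vector for one sequence (assumed formulation)."""
    seq = [c for c in seq.upper() if c in BASES]  # drop ambiguous symbols
    n = len(seq)
    if n <= max_gap:
        raise ValueError("sequence must be longer than max_gap")
    base_p = {b: seq.count(b) / n for b in BASES}
    vec = []
    for i in BASES:
        for j in BASES:
            t = 0.0
            for gap in range(1, max_gap + 1):
                pairs = n - gap
                hits = sum(1 for p in range(pairs)
                           if seq[p] == i and seq[p + gap] == j)
                joint = hits / pairs
                if joint > 0:  # zero-probability terms contribute nothing
                    t += joint * log2(joint / (base_p[i] * base_p[j]))
            vec.append(t)
    return vec


def bbc_distance(a, b, max_gap):
    """Euclidean distance between two correlation vectors (illustrative choice)."""
    va, vb = bbc_vector(a, max_gap), bbc_vector(b, max_gap)
    return sqrt(sum((x - y) ** 2 for x, y in zip(va, vb)))
```

The nested loops keep the sketch readable; a real implementation for whole genomes would count pairs in a single pass per gap.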

19.
Highly accurate estimation of phylogenetic trees for large data sets is difficult, in part because multiple sequence alignments must be accurate for phylogeny estimation methods to be accurate. Coestimation of alignments and trees has been attempted but currently only SATé estimates reasonably accurate trees and alignments for large data sets in practical time frames (Liu K., Raghavan S., Nelesen S., Linder C.R., Warnow T. 2009b. Rapid and accurate large-scale coestimation of sequence alignments and phylogenetic trees. Science. 324:1561-1564). Here, we present a modification to the original SATé algorithm that improves upon SATé (which we now call SATé-I) in terms of speed and of phylogenetic and alignment accuracy. SATé-II uses a different divide-and-conquer strategy than SATé-I and so produces smaller more closely related subsets than SATé-I; as a result, SATé-II produces more accurate alignments and trees, can analyze larger data sets, and runs more efficiently than SATé-I. Generally, SATé is a metamethod that takes an existing multiple sequence alignment method as an input parameter and boosts the quality of that alignment method. SATé-II-boosted alignment methods are significantly more accurate than their unboosted versions, and trees based upon these improved alignments are more accurate than trees based upon the original alignments. Because SATé-I used maximum likelihood (ML) methods that treat gaps as missing data to estimate trees and because we found a correlation between the quality of tree/alignment pairs and ML scores, we explored the degree to which SATé's performance depends on using ML with gaps treated as missing data to determine the best tree/alignment pair. We present two lines of evidence that using ML with gaps treated as missing data to optimize the alignment and tree produces very poor results. 
First, we show that the optimization problem where a set of unaligned DNA sequences is given and the output is the tree and alignment of those sequences that maximize likelihood under the Jukes-Cantor model is uninformative in the worst possible sense. For all inputs, all trees optimize the likelihood score. Second, we show that a greedy heuristic that uses GTR+Gamma ML to optimize the alignment and the tree can produce very poor alignments and trees. Therefore, the excellent performance of SATé-II and SATé-I is not because ML is used as an optimization criterion for choosing the best tree/alignment pair but rather due to the particular divide-and-conquer realignment techniques employed.

20.
《Genomics》2019,111(6):1574-1582
Given the vast amount of genomic data, alignment-free sequence comparison methods are required due to their low computational complexity. k-mer based methods can improve comparison accuracy by extracting an effective feature of the genome sequences. The aim of this paper is to extract k-mer intervals of a sequence as a feature of a genome for high comparison accuracy. In the proposed method, we calculated the distance between genome sequences by comparing the distribution of k-mer intervals. Then, we identified the classification results using phylogenetic trees. We used viral, mitochondrial (MT), microbial and mammalian genome sequences to perform classification for various genome sets. We confirmed that the proposed method provides a better classification result than other k-mer based methods. Furthermore, the proposed method could efficiently be applied to long sequences such as human and mouse genomes.
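One way to read "distribution of k-mer intervals" is sketched below: an interval is taken to be the distance between consecutive occurrences of the same k-mer, and all intervals are pooled into one normalized histogram per sequence. The abstract does not define the feature or the distance, so this pooling, the L1 distance, and the names `kmer_interval_distribution` and `interval_distance` are assumptions for illustration only.

```python
from collections import Counter


def kmer_interval_distribution(seq, k):
    """Normalized histogram of gaps between consecutive occurrences of each k-mer."""
    last_seen = {}         # k-mer -> position of its most recent occurrence
    intervals = Counter()  # interval length -> number of times observed
    for pos in range(len(seq) - k + 1):
        kmer = seq[pos:pos + k]
        if kmer in last_seen:
            intervals[pos - last_seen[kmer]] += 1
        last_seen[kmer] = pos
    total = sum(intervals.values())
    if total == 0:
        return {}  # no k-mer occurs twice
    return {length: count / total for length, count in intervals.items()}


def interval_distance(a, b, k):
    """L1 distance between two interval histograms (illustrative choice)."""
    pa = kmer_interval_distribution(a, k)
    pb = kmer_interval_distribution(b, k)
    return sum(abs(pa.get(d, 0.0) - pb.get(d, 0.0)) for d in set(pa) | set(pb))
```

Because each sequence is scanned once with a dictionary lookup per position, the cost grows linearly with sequence length, which fits the abstract's remark about long genomes.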
