Similar articles
 20 similar articles found
1.
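Using the genomes of 7 archaea, 46 bacteria and 10 eukaryotes as samples, and taking into account both short-range and long-range correlations between bases, the authors obtained the dinucleotide frequencies at different positions in codon pairs of coding sequences and in triplet pairs of intergenic sequences, and used them to construct phylogenetic relationships based on coding sequences and on intergenic sequences. Whether clustering used coding-sequence or intergenic-sequence information, the archaea and the eukaryotes were each grouped on a single branch, indicating that the choice of clustering parameters was appropriate. Pairwise comparison with the phylogeny constructed from amino acid sequences showed that, for most Firmicutes, evolution differs considerably between coding and intergenic sequences, and between coding sequences and amino acid sequences. The analysis suggests that a more natural phylogeny can be obtained only by jointly considering the evolutionary information in all three types of sequence.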

2.
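When phylogenetic trees are constructed, their topologies can be inconsistent across different genomic regions. To address this, Bayesian concordance analysis (BCA) performs phylogenetic tree analysis at whole-genome scale and quantifies the discordance statistically. The method was applied to 129S1 mice, a strain produced by multiple generations of backcrossing between C3H/Hu mice (Mus musculus) and 129Sv mice. A set of sequence files was taken as input and processed (repeat masking, sequence alignment and so on) with several bioinformatics tools (such as VCFtools, RepeatMasker, PAUP*4.0, MrModeltest and MrBayes), supplemented by Perl scripts, to obtain genome-wide information on phylogenetic discordance among regions. Across all 99 loci on mouse chromosome 10, the topology placing strains 129S1 and 129Sv as sisters accounted for 84.7% (the highest posterior probability), showing that C3H/Hu mice contributed relatively little to the 129S1 genome. The results indicate that BCA is useful for studying the evolutionary histories of different genomic regions.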

3.
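Orthologs are genes in different species that originate from the most recent common ancestor of those species. Orthologous genes have similar or even identical functions, are regulated through similar pathways, and play similar or identical roles in different species, so they are the most reliable choice for annotating genome sequences. Bioinformatic methods for predicting orthologs fall into two main classes: phylogenetic methods and sequence-comparison methods. Both rely on sequence similarity, but each has its own characteristics. Phylogenetic methods predict orthologs by reconstructing phylogenetic trees; they are conceptually more precise, but hard to automate and computationally expensive. Sequence-comparison methods are conceptually cruder but simple and practical, with a relatively small computational cost, and have therefore been widely used.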
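
To make the sequence-comparison class of methods concrete, here is a minimal sketch of reciprocal best hits, one widely used heuristic of that kind. The abstract does not name a specific algorithm, so the choice of heuristic, the function names and the score dictionaries (standing in for similarity scores such as alignment bit scores) are all assumptions for illustration.

```python
from typing import Dict, List, Tuple

Scores = Dict[Tuple[str, str], float]  # (query gene, subject gene) -> similarity score


def best_hits(scores: Scores) -> Dict[str, str]:
    """Map each query gene to its highest-scoring subject gene."""
    best: Dict[str, Tuple[str, float]] = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Scores, b_vs_a: Scores) -> List[Tuple[str, str]]:
    """Return pairs (a, b) where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical scores between genes of species A and species B.
a_vs_b = {("a1", "b1"): 250.0, ("a1", "b2"): 90.0,
          ("a2", "b2"): 180.0, ("a3", "b2"): 200.0}
b_vs_a = {("b1", "a1"): 245.0, ("b2", "a3"): 198.0, ("b2", "a2"): 175.0}

print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

In this toy data, a2 is dropped because b2 prefers a3. That shows both why the approach is cheap to automate and why it is conceptually cruder than a tree-based method: it never examines the gene family's history.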

4.
Liu Junhong, Li Chun. 《生物信息学》 (Chinese Journal of Bioinformatics), 2013, 11(2): 142-145
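Using the frequencies of k-words in a DNA sequence, each sequence is converted into a 340-dimensional vector, from which the evolutionary distances between species are calculated. As applications, phylogenetic trees were constructed for the β-globin genes of 15 species, the S segments of 13 hantaviruses, and 26 box-turtle mitochondrial genes. The results agree with earlier conclusions, demonstrating the effectiveness of the method.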
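
The sketch below illustrates one way such a vector could be built. The abstract gives only the dimension 340. Because 4 + 16 + 64 + 256 = 340, the code assumes the vector concatenates the frequencies of all words of length 1 to 4. That reading, the per-length normalisation, and the Euclidean distance are assumptions rather than the authors' stated procedure.

```python
from itertools import product
from math import sqrt
from typing import List

ALPHABET = "ACGT"
# Assumed layout: all words of length 1..4, i.e. 4 + 16 + 64 + 256 = 340 entries.
WORDS = ["".join(p) for k in range(1, 5) for p in product(ALPHABET, repeat=k)]
INDEX = {word: i for i, word in enumerate(WORDS)}


def kword_vector(seq: str) -> List[float]:
    """Relative frequency of every k-word (k = 1..4), normalised per word length."""
    seq = seq.upper()
    vec = [0.0] * len(WORDS)
    for k in range(1, 5):
        windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
        valid = [w for w in windows if w in INDEX]  # skip windows containing N etc.
        for w in valid:
            vec[INDEX[w]] += 1.0 / len(valid)
    return vec


def euclidean(u: List[float], v: List[float]) -> float:
    """Euclidean distance between two frequency vectors (an assumed metric)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


print(len(WORDS))  # dimension of the vector
print(euclidean(kword_vector("ACGTACGTAC"), kword_vector("AAGTTCGTAA")))
```

A pairwise distance matrix computed this way could then be passed to any distance-based tree builder. No alignment is needed, which is the practical appeal of k-word methods.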

5.
Sequence variation analysis of the chloroplast infA-rpl36 region in Triticeae species   (Cited by: 3; self-citations: 1; citations by others: 2)
Liu Chang, Yang Zujun, Li Guangrong, Feng Juan, Deng Kejun, Huang Jian, Ren Zhenglong. 《遗传》 (Hereditas (Beijing)), 2006, 28(10): 1265-1272
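Primers were designed from the sequence of the infA-rpl36 region of the wheat chloroplast genome, and 12 diploid and polyploid species of the Triticeae were subjected to PCR amplification and sequencing, yielding 12 DNA sequences of 584-603 bp. Sequence analysis showed that nucleotide variation in the infA-rpl36 intergenic spacer was markedly higher than in the coding regions. Nucleotide sequence identity in the coding regions reached 97%, indicating that the target fragment is highly conserved. However, large insertion and deletion mutations occurred in the infA coding region of 5 species, leading to substantial changes in the deduced amino acid sequences. This confirms that infA is one of the most active genes in the chloroplast genome, whereas rpl36 varied little, showing that different chloroplast genes evolve at different rates. Phylogenetic tree analysis based on the determined sequences showed that the polyploid species Thinopyrum intermedium has multiple different cytoplasmic origins and, like its nuclear genome, is evolutionarily complex.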

6.
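Using the reported Drosophila melanogaster U83 gene to search Drosophila genome databases, the authors identified 10 new U83 homologs in the Drosophilidae, all located in an intron of the ribosomal protein gene rpl3 of the corresponding species. With Anopheles gambiae as the outgroup, the U83 nucleotide sequences of 11 Drosophila species were analysed for evolutionary relationships, and a phylogenetic tree was reconstructed by the neighbor-joining method. Compared with the tree built by traditional methods, the result reflects the broad evolutionary relationships of the Drosophilidae, though some differences remain. To add sequence information, the sequence was extended to the entire intron containing U83 and a tree was built in the same way; the result was almost identical to the traditional phylogenetic tree. This study is the first attempt to construct a phylogenetic tree from box C/D snoRNA gene sequences, and the results show that U83 is well suited to constructing phylogenetic trees of species within the Drosophilidae.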

7.
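The molecular phylogenetic position of musk deer (Moschus spp.) was re-examined. The results show that the choice of genes and sequence lengths, the number of species included, and the analytical method all markedly affect the outcome. NJ and MP trees built from the mitochondrial Cyt b gene, the 16S rRNA gene, and their concatenation all support a closer relationship between musk deer and deer. The NJ tree built from the coding sequence of the nuclear interferon-γ gene also places musk deer close to deer, whereas the MP tree suggests a closer relationship with the Bovidae. When 18 species including the forest musk deer were selected and trees were built from the concatenated nucleotide sequences of the 12 protein-coding genes on the heavy strand of the mitochondrial genome, the NJ, MP, ML and BI trees all supported a closer relationship between musk deer and deer. This differs markedly from the result obtained with 23 species, in which musk deer were the sister group of the Cervidae and Bovidae together. The discrepancy may arise because the Moschidae are closely related to both the Cervidae and the Bovidae, with short divergence times, so that the molecular fragments have accumulated little evolutionary information, and because different fragments evolve at different rates. Fully resolving the position of musk deer within the Artiodactyla therefore requires more suitable molecular markers.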

8.
Comparison of topological distances among mtDNA gene trees and gene grouping   (Cited by: 1; self-citations: 0; citations by others: 1)
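Comparison of topological distance data among gene trees further confirms that the path-difference topological distance is a more precise measure than the partition topological distance. A topological-distance tree of 8 genes was built using the relative path-difference topological distance. Such a tree directly reflects how much the topologies of different gene trees differ and can be used to group genes. In addition, when different DNA sequences are used to build multi-gene trees, their phylogenetic information was found to combine through mathematical relations such as "addition", "conjunction", "coverage" and "mutual exclusion". This may explain why some genes in the mtDNA genome are more suitable than others for tree construction. The results suggest that DNA sequences or protein amino acid sequences of genes with additive information should be selected from GenBank and combined to construct species trees. The discussion also proposes a new tree-building strategy for obtaining the true tree.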
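
As a point of reference for the coarser of the two measures, the following sketch computes a partition (split-based) topological distance between two trees. The nested-tuple tree encoding and all function names are hypothetical. The path-difference distance favoured in the abstract is not implemented here.

```python
from typing import FrozenSet, Set, Tuple, Union

Tree = Union[str, Tuple["Tree", ...]]  # a leaf name or a tuple of subtrees


def leaves(tree: Tree) -> FrozenSet[str]:
    """All leaf names below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree: Tree) -> Set[FrozenSet[str]]:
    """Non-trivial bipartitions, each stored as the side lacking a reference taxon."""
    taxa = leaves(tree)
    ref = min(taxa)
    found: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        side = taxa - clade if ref in clade else clade
        if 1 < len(side) < len(taxa) - 1:  # both sides hold at least two taxa
            found.add(side)
        return clade

    visit(tree)
    return found


def partition_distance(t1: Tree, t2: Tree) -> int:
    """Number of bipartitions present in exactly one of the two trees."""
    return len(splits(t1) ^ splits(t2))


tree_a = ((("A", "B"), "C"), ("D", "E"))
tree_b = ((("A", "C"), "B"), ("D", "E"))
print(partition_distance(tree_a, tree_b))
```

Because this measure only counts whether a split is shared, moving a single taxon a long way can produce the maximum distance. This is one common argument for finer-grained measures such as path-based distances.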

9.
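Birds are the most species-rich group of vertebrates among the tetrapods. Using whole-genome nucleotide sequence data from 12 bird species, this study established an evolution equation for nucleotide frequencies and investigated the mechanisms and regularities of nucleotide-frequency evolution in bird genomes. Fitting the genomic data determined the evolutionary inertia, dissipation and environmental parameters of the equation, gave an estimate of the evolutionary rate, and produced curves of genome length over time. The solution shows the genome growing rapidly over a short period with rapid accumulation of information, then entering a phase of evolutionary stasis in which nucleotide frequencies no longer change appreciably. The approach offers a new way to study quantitatively the evolution of birds and of species in general.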

10.
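The complete mitochondrial genome of Ducetia japonica (日本条螽) is 16,281 bp long and comprises 13 protein-coding genes, 22 tRNA genes, 2 rRNA genes and 1 D-loop region; its gene order and orientation are identical to the ancestral arrangement. The genome is compactly arranged, but there is a 650-bp intergenic spacer between ND2 and tRNA-Trp. To study phylogenetic relationships within the Tettigoniidae, the protein-coding and rRNA gene sequences from the mitochondrial genomes of D. japonica and 17 other tettigoniid species were used to construct a Bayesian phylogenetic tree.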

11.
Wei C, Wang G, Chen X, Huang H, Liu B, Xu Y, Li F. PLoS ONE, 2011, 6(10): e26296
Identification and typing of human enteroviruses (HEVs) are important to pathogen detection and therapy. Previous phylogeny-based typing methods are mainly based on multiple sequence alignments of specific genes in the HEVs, but the results are not stable with respect to different choices of genes. Here we report a novel method for identification and typing of HEVs based on information derived from their whole genomes. Specifically, we calculate the k-mer based barcode image for each genome, HEV or other human viruses, for a fixed k, 1…

12.
The completely sequenced genomes of chloroplasts have provided much information on the origin and evolution of this organelle. In this paper we attempt to use these sequences to test a novel approach for phylogenetic analysis of complete genomes based on correlation analysis of compositional vectors. All protein sequences from 21 complete chloroplast genomes are analyzed in comparison with selected archaea, eubacteria, and eukaryotes. The distance-based analysis shows that the chloroplast genomes are most closely related to cyanobacteria, consistent with the endosymbiotic origin of chloroplasts. The chloroplast genomes are separated into two major clades corresponding to chlorophytes (green plants) s.l. and rhodophytes (red algae) s.l. The interrelationships among the chloroplasts are largely in agreement with the current understanding of chloroplast evolution. For instance, the analysis places the chloroplasts of two chromophytes (Guillardia and Odontella) within the rhodophyte lineage, supporting secondary endosymbiosis as the source of these chloroplasts. The relationships among the green algae and land plants in our tree also agree with results from traditional phylogenetic analyses. Thus, this study establishes the value of our simple correlation analysis in elucidating the evolutionary relationships among genomes. It is hoped that this approach will provide insights into comparative genome analysis.

13.
Use of whole genome sequence data to infer baculovirus phylogeny   (Cited by: 18; self-citations: 0; citations by others: 18)
Several phylogenetic methods based on whole genome sequence data were evaluated using data from nine complete baculovirus genomes. The utility of three independent character sets was assessed. The first data set comprised the sequences of the 63 genes common to these viruses. The second set of characters was based on gene order, and phylogenies were inferred using both breakpoint distance analysis and a novel method developed here, termed neighbor pair analysis. The third set recorded gene content by scoring gene presence or absence in each genome. All three data sets yielded phylogenies supporting the separation of the Nucleopolyhedrovirus (NPV) and Granulovirus (GV) genera, the division of the NPVs into groups I and II, and species relationships within group I NPVs. Generation of phylogenies based on the combined sequences of all 63 shared genes proved to be the most effective approach to resolving the relationships among the group II NPVs and the GVs. The history of gene acquisitions and losses that have accompanied baculovirus diversification was visualized by mapping the gene content data onto the phylogenetic tree. This analysis highlighted the fluid nature of baculovirus genomes, with evidence of frequent genome rearrangements and multiple gene content changes during their evolution. Of more than 416 genes identified in the genomes analyzed, only 63 are present in all nine genomes, and 200 genes are found only in a single genome. Despite this fluidity, the whole genome-based methods we describe are sufficiently powerful to recover the underlying phylogeny of the viruses.
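
For the gene-order character set, a breakpoint distance can be sketched as follows. This is a simplified, unsigned version that treats each genome as a circular list of shared genes, each occurring once. The gene names are hypothetical, and the abstract does not give the exact formulation used in the study, so the details here are assumptions.

```python
from typing import FrozenSet, List, Set


def adjacencies(order: List[str], circular: bool = True) -> Set[FrozenSet[str]]:
    """Unordered pairs of neighbouring genes in a gene order."""
    pairs = list(zip(order, order[1:]))
    if circular and len(order) > 2:
        pairs.append((order[-1], order[0]))  # close the circle
    return {frozenset(pair) for pair in pairs}


def breakpoint_distance(order_a: List[str], order_b: List[str],
                        circular: bool = True) -> int:
    """Count adjacencies of genome A (restricted to shared genes) absent from genome B."""
    shared = set(order_a) & set(order_b)
    a = [gene for gene in order_a if gene in shared]
    b = [gene for gene in order_b if gene in shared]
    return len(adjacencies(a, circular) - adjacencies(b, circular))


genome_1 = ["g1", "g2", "g3", "g4", "g5"]
genome_2 = ["g1", "g2", "g4", "g3", "g5"]
print(breakpoint_distance(genome_1, genome_2))
```

Using unordered pairs means that reading a genome in the opposite direction leaves the distance at zero, which is the desired behaviour when strand orientation is ignored.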

14.
Genomics, 2019, 111(6): 1574-1582
Given the vast amount of genomic data, alignment-free sequence comparison methods are required due to their low computational complexity. k-mer based methods can improve comparison accuracy by extracting an effective feature of the genome sequences. The aim of this paper is to extract k-mer intervals of a sequence as a feature of a genome for high comparison accuracy. In the proposed method, we calculated the distance between genome sequences by comparing the distribution of k-mer intervals. Then, we identified the classification results using phylogenetic trees. We used viral, mitochondrial (MT), microbial and mammalian genome sequences to perform classification for various genome sets. We confirmed that the proposed method provides a better classification result than other k-mer based methods. Furthermore, the proposed method could efficiently be applied to long sequences such as human and mouse genomes.
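
The abstract does not specify how the interval distribution is built or compared, so the following is only one plausible reading, offered as a sketch. It pools the gaps between consecutive occurrences of each k-mer into a normalised histogram and compares two histograms with an L1 distance. Both choices and all names are assumptions.

```python
from collections import Counter
from typing import Dict


def interval_distribution(seq: str, k: int) -> Dict[int, float]:
    """Normalised histogram of gaps between consecutive occurrences of the same k-mer."""
    last_seen: Dict[str, int] = {}
    gaps: Counter = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in last_seen:
            gaps[i - last_seen[word]] += 1
        last_seen[word] = i
    total = sum(gaps.values())
    return {gap: count / total for gap, count in gaps.items()} if total else {}


def l1_distance(p: Dict[int, float], q: Dict[int, float]) -> float:
    """Sum of absolute differences between two interval histograms."""
    return sum(abs(p.get(g, 0.0) - q.get(g, 0.0)) for g in set(p) | set(q))


dist_a = interval_distribution("ACACAC", 2)
dist_b = interval_distribution("ACGACG", 2)
print(dist_a, dist_b, l1_distance(dist_a, dist_b))
```

The scan is a single pass with one dictionary lookup per position, which is consistent with the abstract's claim that such features remain practical for long genomes.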

15.
A heuristic approach to search for the maximum-likelihood (ML) phylogenetic tree based on a genetic algorithm (GA) has been developed. It outputs the best tree as well as multiple alternative trees that are not significantly worse than the best one on the basis of the likelihood criterion. These near-optimum trees are subjected to further statistical tests. This approach enables one to infer phylogenetic trees of over 20 taxa, taking account of the rate heterogeneity among sites, on practical time scales on a PC cluster. Computer simulations were conducted to compare the efficiency of the present approach with that of several likelihood-based methods and distance-based methods, using amino acid sequence data of relatively large numbers (5–24) of taxa. The superiority of the ML method over distance-based methods increases as the simulation conditions become more realistic (an incorrect model is assumed or many taxa are involved). This approach was applied to the inference of the universal tree based on the concatenated amino acid sequences of vertically descended genes that are shared among all genomes whose complete sequences have been reported. The inferred tree strongly supports that Archaea is paraphyletic and Eukarya is specifically related to Crenarchaeota. Apart from the paraphyly of Archaea and some minor disagreements, the universal tree based on these genes is largely consistent with the universal tree based on SSU rRNA. Received: 4 January 2001 / Accepted: 16 May 2001

16.
We used sequence data from 30 complete Herpesviridae genomes to investigate phylogenetic relationships and patterns of genome evolution. The approach was to identify orthologous gene clusters among taxa and to generate a genomic matrix of gene content. We identified 17 genes with homologs in all 30 taxa and concatenated a subset of 10 of these genes for phylogenetic inference. We also constructed phylogenetic trees on the basis of gene content data. The amino acid and gene content phylogenies were largely concordant, but the amino acid data had much higher internal support. We mapped gene gain events onto the phylogenetic tree by assuming that genes were gained only once during the evolution of herpesviruses. Thirty genes were inferred to be present in the ancestor of all herpesviruses, a number smaller than previously hypothesized. Few genes of recent origin within herpesviruses could be identified as originating from transfer between virus and vertebrate hosts. Inferred rates of gene gain were heterogeneous, with both taxonomic and temporal biases. Nonetheless, the average rate of gene gain was approximately 3.5 × 10^-7 genes gained per year, which is an order of magnitude higher than the nucleotide mutation rate for these large DNA viruses.

17.
The complete genomes of living organisms have provided much information on their phylogenetic relationships. Similarly, the complete genomes of chloroplasts have helped to resolve the evolution of this organelle in photosynthetic eukaryotes. In this paper we propose an alternative method of phylogenetic analysis using compositional statistics for all protein sequences from complete genomes. This new method is conceptually simpler than and computationally as fast as the one proposed by Qi et al. (2004b) and Chu et al. (2004). The same data sets used in Qi et al. (2004b) and Chu et al. (2004) are analyzed using the new method. Our distance-based phylogenetic tree of the 109 prokaryotes and eukaryotes agrees with the biologists' tree of life based on 16S rRNA comparison in a predominant majority of basic branching and most lower taxa. Our phylogenetic analysis also shows that the chloroplast genomes are separated into two major clades corresponding to chlorophytes s.l. and rhodophytes s.l. The interrelationships among the chloroplasts are largely in agreement with the current understanding of chloroplast evolution. Reviewing Editor: Dr. John Oakeshott

18.
The current classification of parvoviruses is based on virus host range and helper virus dependence, while little data on evolutionary relationships among viruses are available. We identified and analyzed 472 sequences of parvoviruses, among which there were (virtually) full-length genomes of all 41 viruses currently recognized as individual species within the family Parvoviridae. Our phylogenetic analysis of full-length genomes as well as open reading frames distinguished three evolutionary groups of parvoviruses from vertebrates: (i) the human helper-dependent adeno-associated virus (AAV) serotypes 1 to 6 and the autonomous avian parvoviruses; (ii) the bovine, chipmunk, and autonomous primate parvoviruses, including human viruses B19 and V9; and (iii) the parvoviruses from rodents (except for chipmunks), carnivores, and pigs. Each of these three evolutionary groups could be further subdivided, reflecting both virus-host coevolution and multiple cross-species transmissions in the evolutionary history of parvoviruses. No parvoviruses from invertebrates clustered with vertebrate parvoviruses. Our analysis provided evidence for negative selection among parvoviruses, the independent evolution of their genes, and recombination among parvoviruses from rodents. The topology of the phylogenetic tree of autonomous human and simian parvoviruses matched exactly the topology of the primate family tree, as based on the analysis of primate mitochondrial DNA. Viruses belonging to the AAV group were not evolutionarily linked to other primate parvoviruses but were linked to the parvoviruses of birds. The two lineages of human parvoviruses may have resulted from independent ancient zoonotic infections. Our results provide an argument for reclassification of Parvovirinae based on evolutionary relationships among viruses.

19.
Primates, the mammalian order including our own species, comprise 480 species in 78 genera. Thus, they represent the third largest of the 18 orders of eutherian mammals. Although recent phylogenetic studies on primates are increasingly built on molecular datasets, most of these studies have focused on taxonomic subgroups within the order. Complete mitochondrial (mt) genomes have proven to be extremely useful in deciphering within-order relationships even up to deep nodes. Using 454 sequencing, we sequenced 32 new complete mt genomes, adding 20 previously unrepresented genera to the phylogenetic reconstruction of the primate tree. With 13 new sequences, the number of complete mt genomes within the parvorder Platyrrhini was widely extended, resulting in a largely resolved branching pattern among New World monkey families. We added 10 new Strepsirrhini mt genomes to the 15 previously available ones, thus almost doubling the number of mt genomes within this clade. Our data allow precise date estimates of all nodes and offer new insights into primate evolution. One major result is a relatively young date for the most recent common ancestor of all living primates, which was estimated at 66-69 million years ago, suggesting that the divergence of extant primates started close to the K/T boundary. Although some relationships remain unclear, the large number of mt genomes used allowed us to reconstruct a robust primate phylogeny which is largely in agreement with previous publications. Finally, we show that mt genomes are a useful tool for resolving primate phylogenetic relationships on various taxonomic levels.

20.
A phylogenetic 'tree of life' has been constructed based on the observed presence and absence of families of protein-encoding genes in 11 complete genomes of free-living microorganisms. Past attempts to reconstruct the evolutionary relationships of microorganisms have been limited to sets of genes rather than complete genomes. Despite apparent rampant lateral gene transfer among microorganisms, these results indicate a single robust underlying evolutionary history for these organisms. Broadly, the tree produced is very similar to the small subunit rRNA tree although several additional phylogenetic relationships appear to be resolved, including the relationship of Archaeoglobus to the methanogens studied. This result is in contrast to notions that a robust phylogenetic reconstruction of microorganisms is impossible due to their genomes being composed of an incomprehensible amalgam of genes with complicated histories and suggests that this style of genome-wide phylogenetic analysis could become an important method for studying the ancient diversification of life on Earth. Analyses using informational and operational subsets of the genes showed that this 'tree of life' is not dependent on the phylogenetically more consistent informational genes.
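
A presence/absence comparison of this kind can be illustrated with a small sketch. The genomes and gene-family identifiers below are invented, and the Jaccard distance is an assumed stand-in: the abstract does not say which distance or tree-building procedure the study used.

```python
from typing import Dict, Set, Tuple


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """1 minus the fraction of gene families shared out of all families seen in either genome."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(genomes: Dict[str, Set[str]]) -> Dict[Tuple[str, str], float]:
    """Pairwise gene-content distances keyed by (genome, genome)."""
    names = sorted(genomes)
    return {(x, y): jaccard_distance(genomes[x], genomes[y])
            for x in names for y in names}


# Hypothetical gene-family content of three genomes.
genomes = {
    "genome_A": {"fam1", "fam2", "fam3"},
    "genome_B": {"fam2", "fam3", "fam4"},
    "genome_C": {"fam5"},
}
matrix = distance_matrix(genomes)
print(matrix[("genome_A", "genome_B")], matrix[("genome_A", "genome_C")])
```

A matrix like this can be passed to a standard distance-based tree method. Lateral gene transfer would show up as unexpectedly small distances between otherwise distant genomes.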
