Similar literature
 Found 20 similar articles (search time: 109 ms)
1.
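Orthology is the relationship between genes that share a common ancestor as a result of a speciation event; orthologous genes usually have similar structures and biological functions. With the rapid accumulation of genome and transcriptome sequences, accurate identification of orthologous genes supports functional gene annotation as well as comparative and evolutionary genomics research. This article reviews the main existing methods for identifying orthologous genes and lists the databases built with them. The methods fall into three classes: the first is based on sequence similarity and offers fast identification and high sensitivity; the second is based on constructing phylogenetic trees and offers high accuracy and rich information; the third is a hybrid approach that combines the two and better balances sensitivity and accuracy. Finally, the problems faced in the identification process are summarized.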
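
As a rough illustration of the first class of methods (sequence similarity), the sketch below finds reciprocal best hits between two genomes. It is only a minimal example, not a method taken from the review: the function name `reciprocal_best_hits` and the input layout (nested dictionaries of similarity scores, such as bit scores from an all-against-all search) are assumptions made for illustration.

```python
def reciprocal_best_hits(scores_ab, scores_ba):
    """Return (a, b) pairs in which each gene is the other's best-scoring hit.

    scores_ab maps each gene of genome A to {gene of genome B: score};
    scores_ba is the same in the opposite direction. Both layouts are
    hypothetical and chosen only for this sketch.
    """
    # Best hit in B for every gene in A (genes with no hits are skipped).
    best_ab = {a: max(hits, key=hits.get) for a, hits in scores_ab.items() if hits}
    # Best hit in A for every gene in B.
    best_ba = {b: max(hits, key=hits.get) for b, hits in scores_ba.items() if hits}
    # Keep only pairs that agree in both directions.
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


if __name__ == "__main__":
    scores_ab = {"a1": {"b1": 200, "b2": 50}, "a2": {"b1": 180, "b2": 60}}
    scores_ba = {"b1": {"a1": 200, "a2": 180}, "b2": {"a2": 60, "a1": 50}}
    print(reciprocal_best_hits(scores_ab, scores_ba))  # [('a1', 'b1')]
```

In the example, `a2` also hits `b1` best, but `b1` prefers `a1`, so only one pair is reported. This conservatism is typical of pairwise best-hit criteria and is one reason the tree-based and hybrid classes exist.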

2.
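Advances in the molecular phylogenetics of ciliates   Total citations: 6, self-citations: 0, citations by others: 6
After reviewing the origin and development of ciliate molecular phylogenetics, this article describes the results and recent progress achieved over the past 20 years, during which molecular biology techniques such as RFLP, RAPD and DNA sequence analysis have served as the discipline's main research methods, in two areas: population genetic diversity and evolution, and the phylogenetics of supraspecific taxa. Finally, while discussing some existing problems in ciliate molecular phylogenetics and ways to address them, it predicts that the field will greatly advance research on important questions in biological evolution, such as the origin and evolution of eukaryotes and endosymbiosis.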

3.
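《遗传》 (Hereditas (Beijing)), 2016, (2)
Homologous recombination is one of the important forces shaping the diversity of bacterial populations. Through homologous recombination, genetic material is transferred horizontally between different bacterial lineages, disrupting the vertical phylogenetic structure produced by clonal reproduction and thereby complicating phylogenetic reconstruction and the determination of population structure. This article discusses the impact of homologous recombination on phylogenetic analysis and evolutionary studies, reviews from a practical perspective the commonly used software and methods for quantifying the extent of recombination and identifying recombination events, and summarizes the strengths and weaknesses of each software tool and modeling approach, with the aim of providing a reference for bacterial recombination analysis and population evolution research.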

4.
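A molecular systematic study of representative genera of the subfamily Orthocladiinae (直突摇蚊亚科) was carried out using partial sequences of the 28S rDNA D1 region. The 28S rDNA D1 fragment was sequenced for 12 ingroup genera and 2 outgroup species and analyzed together with homologous sequences of this gene from 3 species of the same subfamily in GenBank. Two tree-building methods (neighbor-joining, NJ, and maximum parsimony, MP) were used to analyze the molecular phylogenetic relationships among genus-level taxa within Orthocladiinae. The results show that the genus Clunio lies at the base of the phylogenetic tree, consistent with the distinctive marine lifestyle of this genus. Cardiocladius (心突摇蚊属) and Eukiefferiella (真开氏摇蚊属) are sister groups, as are Rheocricotopus (流环足摇蚊属) and Psectrocladius (刀突摇蚊属); these results agree with those of morphology-based phylogenetic studies. The phylogenetic relationships among the other genera await further study, since no previous work on them exists. This study also shows that the 28S rDNA D1 fragment is of some value for analyzing relationships at the genus level and below in the Chironomidae.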

5.
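Objective: To predict the structure and function of the human mitochondrial transcription termination factor 3 (hMTERF3) protein using bioinformatics. Methods: Database resources from GenBank, Uniprot, ExPASy and SWISS-PROT and various bioinformatics software were used to study hMTERF3 systematically, covering its physicochemical properties, transmembrane regions and signal peptide, secondary-structure functional domains, subcellular localization, functional classification prediction, multiple sequence alignment of homologous proteins, phylogenetic tree construction, and homology modeling of the tertiary structure. Results: The software predicted that hMTERF3 has a relative molecular mass of 47.97 × 10³ and an isoelectric point of 8.60, with no signal peptide or transmembrane region. Secondary-structure analysis showed mainly helices and random coils, including 6 MTERF motifs, and the tertiary-structure prediction agreed with the secondary-structure prediction. Subcellular localization analysis placed the protein in human mitochondria. Functional classification predicted it to be a transport and binding protein involved in the regulation of gene transcription. Multiple sequence alignment of homologous proteins and evolutionary analysis showed that hMTERF3 is highly homologous to the MTERF3 proteins of mammals such as rat and mouse and clusters with them in the phylogenetic tree. Conclusion: The bioinformatic analysis of hMTERF3 provides a theoretical basis for further experimental studies of the structure and function of this protein.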

6.
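Homologous genes are divided into orthologs, paralogs and xenologs. This article distinguishes among these three types of homologous genes and briefly introduces the further classification of orthologs and paralogs.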

7.
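Alignment-free sequence analysis methods use the k-mer frequencies of specific oligonucleotides in a sequence as features, and compare and analyze sequences according to those features. Because this approach takes into account the effect of all types of variation on the overall characteristics of a sequence, it has distinct advantages in the analysis of omics data. However, the applicability of such methods to the genome-scale phylogeny of complex multicellular organisms remains to be tested. In this paper, we used the alignment-free software CVTree to build an NJ tree of phylogenetic relationships from the proteome data of 45 mammal species and, on that basis, examined several questions in mammalian phylogeny. On the widely discussed question of the relationships among the four superorders of the infraclass Eutheria, CVTree supports the prevailing morphological conclusion, the Epitheria hypothesis. This differs from the Exafro-placentalia hypothesis supported by alignment-based methods. At the level of orders within mammals, the conclusions of the CVTree tree are largely consistent with the phylogenetic relationships generally accepted on molecular and morphological grounds. Within orders, however, the CVTree tree shows more discrepancies. The results provide a preliminary indication that alignment-free methods are feasible for phylogenetic analysis using omics data from complex multicellular organisms. Improvements to alignment-free methods themselves, and comparisons with traditional substitution-based alignment methods, remain to be studied in depth.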
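
The general idea of comparing sequences by k-mer frequencies can be sketched as follows. This is a simplified illustration only and does not reproduce CVTree's actual procedure: the helper names `kmer_frequencies` and `cosine_distance`, and the choice of plain relative frequencies with a cosine-based distance, are assumptions made for the example.

```python
from collections import Counter
from math import sqrt


def kmer_frequencies(seq, k):
    """Return the relative frequency of every overlapping k-mer in seq."""
    if k <= 0 or len(seq) < k:
        raise ValueError("sequence must contain at least one k-mer")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def cosine_distance(p, q):
    """Return 1 - cosine similarity of two sparse frequency vectors."""
    dot = sum(value * q.get(kmer, 0.0) for kmer, value in p.items())
    norm_p = sqrt(sum(value * value for value in p.values()))
    norm_q = sqrt(sum(value * value for value in q.values()))
    return 1.0 - dot / (norm_p * norm_q)


if __name__ == "__main__":
    a = kmer_frequencies("MKVLAAGIVGLMKVLAAG", 3)
    b = kmer_frequencies("MKVLSAGIVGLMKVLSAG", 3)
    print(round(cosine_distance(a, b), 3))
```

A matrix of such pairwise distances could then be passed to a distance-based tree builder such as neighbor-joining, which is the kind of tree the abstract reports.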

8.
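Comparative genomics research based on orthologous sequences   Total citations: 2, self-citations: 0, citations by others: 2
Orthologous sequences have similar or even identical functions and similar regulatory pathways in different species, and play similar or even identical roles; moreover, the great majority of core biological functions are carried out by a considerable number of orthologous genes. They are the most reliable choice for the functional annotation and analysis of genome sequences. Because of these particular biological properties, comparative genomics research based on orthologous sequences should provide clues for detecting the emergence, expression and loss of important functional genes in different organisms during evolution. This article reviews the basic properties of orthologous genes, the relationship between orthologous sequences and comparative genomics, and the methods and current status of comparative genomics research using orthologous sequences. Keywords: orthology; comparative genomics; biological properties; database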

9.
10.
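In the course of moving from water to land, plants underwent very complex evolution, and the many genes that arose during this period may have followed different evolutionary routes, so a phylogenetic tree alone cannot represent the true evolutionary relationships. A phylogenetic network can clearly display complex reticulate evolutionary relationships, including both vertical and horizontal evolution. In this study we selected Chlamydomonas reinhardtii and 4 land plant species and, using a phylogenomic approach, screened 1,668 one-to-one orthologous genes to reconstruct the reticulate phylogenetic relationships of land plants. Different analysis strategies yielded different phylogenetic trees; analyzing the 1,668 genes individually revealed 15 different topologies; and phylogenetic network analysis of the orthologous genes screened from the 5 species showed that, in a very robust phylogenetic network, these 5 species alone gave 9 different splits, implying very complex reticulate evolution. Moreover, the splits separating the algae from the bryophytes and lycophytes differed very little, which may be one cause of the conflicts among phylogenetic trees and also suggests that early land plants underwent a complex radiation.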

11.
Reliable orthology prediction is central to comparative genomics. Although orthology is defined by phylogenetic criteria, most automated prediction methods are based on pairwise sequence comparisons. Recently, automated phylogeny-based orthology prediction has emerged as a feasible alternative for genome-wide studies.

12.

Background  

The transfer of functional annotations from model organism proteins to human proteins is one of the main applications of comparative genomics. Various methods are used to analyze cross-species orthologous relationships according to an operational definition of orthology. Often the definition of orthology is incorrectly interpreted as a prediction of proteins that are functionally equivalent across species, while in fact it only defines the existence of a common ancestor for a gene in different species. However, it has been demonstrated that orthologs often reveal significant functional similarity. Therefore, the quality of the orthology prediction is an important factor in the transfer of functional annotations (and other related information). To identify protein pairs with the highest possible functional similarity, it is important to qualify ortholog identification methods.

13.
Ortholog identification is used in gene functional annotation, species phylogeny estimation, phylogenetic profile construction and many other analyses. Bioinformatics methods for ortholog identification are commonly based on pairwise protein sequence comparisons between whole genomes. Phylogenetic methods of ortholog identification have also been developed; these methods can be applied to protein data sets sharing a common domain architecture or which share a single functional domain but differ outside this region of homology. While promiscuous domains represent a challenge to all orthology prediction methods, overall structural similarity is highly correlated with proximity in a phylogenetic tree, conferring a degree of robustness to phylogenetic methods. In this article, we review the issues involved in orthology prediction when data sets include sequences with structurally heterogeneous domain architectures, with particular attention to automated methods designed for high-throughput application, and present a case study to illustrate the challenges in this area.

14.
Correct orthology assignment is a critical prerequisite of numerous comparative genomics procedures, such as function prediction, construction of phylogenetic species trees and genome rearrangement analysis. We present an algorithm for the detection of non-orthologs that arise by mistake in current orthology classification methods based on genome-specific best hits, such as the COGs database. The algorithm works with pairwise distance estimates, rather than computationally expensive and error-prone tree-building methods. The accuracy of the algorithm is evaluated through verification of the distribution of predicted cases, case-by-case phylogenetic analysis and comparisons with predictions from other projects using independent methods. Our results show that a very significant fraction of the COG groups include non-orthologs: using conservative parameters, the algorithm detects non-orthology in a third of all COG groups. Consequently, sequence analysis sensitive to correct orthology assignments will greatly benefit from these findings.

15.
Reliable prediction of orthology is central to comparative genomics. Approaches based on phylogenetic analyses closely resemble the original definition of orthology and paralogy and are known to be highly accurate. However, the large computational cost associated with these analyses is a limiting factor that often prevents their use at genomic scales. Recently, several projects have addressed the reconstruction of large collections of high-quality phylogenetic trees from which orthology and paralogy relationships can be inferred. This provides us with the opportunity to infer the evolutionary relationships of genes from multiple, independent, phylogenetic trees. Using such a strategy, we combine phylogenetic information derived from different databases, to predict orthology and paralogy relationships for 4.1 million proteins in 829 fully sequenced genomes. We show that the number of independent sources from which a prediction is made, as well as the level of consistency across predictions, can be used as reliable confidence scores. A webserver has been developed to easily access these data (http://orthology.phylomedb.org), which provides users with a global repository of phylogeny-based orthology and paralogy predictions.
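
The two confidence measures mentioned in this abstract, the number of independent sources and the consistency across predictions, can be illustrated with a small sketch. It is a hypothetical simplification rather than the scoring scheme used by the authors: the function name `consensus_call` and the label strings are assumptions, and consistency is taken here simply as the fraction of trees that agree with the majority call.

```python
from collections import Counter


def consensus_call(calls):
    """Summarize per-tree calls for one gene pair.

    calls is a list of labels (for example "ortholog" or "paralog"), one per
    independent phylogenetic tree. Returns (majority label, number of
    sources, fraction of sources agreeing with the majority label).
    """
    if not calls:
        raise ValueError("at least one prediction is required")
    counts = Counter(calls)
    label, votes = counts.most_common(1)[0]
    return label, len(calls), votes / len(calls)


if __name__ == "__main__":
    calls = ["ortholog", "ortholog", "paralog", "ortholog"]
    print(consensus_call(calls))  # ('ortholog', 4, 0.75)
```

A pair supported by many trees with a consistency near 1.0 would be treated as more reliable than one seen in a single tree or with conflicting calls. How ties and thresholds are handled is left open in this sketch.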

16.
Orthology is a powerful refinement of homology that allows us to describe more precisely the evolution of genomes and understand the function of the genes they contain. However, because orthology is not concerned with genomic position, it is limited in its ability to describe genes that are likely to have equivalent roles in different genomes. Because of this limitation, the concept of 'positional orthology' has emerged, which describes the relation between orthologous genes that retain their ancestral genomic positions. In this review, we formally define this concept, for which we introduce the shorter term 'toporthology', with respect to the evolutionary events experienced by a gene's ancestors. Through a discussion of recent studies on the role of genomic context in gene evolution, we show that the distinction between orthology and toporthology is biologically significant. We then review a number of orthology prediction methods that take genomic context into account and thus that may be used to infer the important relation of toporthology.

17.
Accurate orthology prediction is crucial for many applications in the post-genomic era. The lack of broadly accepted benchmark tests precludes a comprehensive analysis of orthology inference. So far, functional annotation between orthologs serves as a performance proxy. However, this violates the fundamental principle of orthology as an evolutionary definition, while it is often not applicable due to limited experimental evidence for most species. Therefore, we constructed high quality "gold standard" orthologous groups that can serve as a benchmark set for orthology inference in bacterial species. Herein, we used this dataset to demonstrate 1) why a manually curated, phylogeny-based dataset is more appropriate for benchmarking orthology than other popular practices and 2) how it guides database design and parameterization through careful error quantification. More specifically, we illustrate how function-based tests often fail to identify false assignments, misjudging the true performance of orthology inference methods. We also examined how our dataset can instruct the selection of a "core" species repertoire to improve detection accuracy. We conclude that including more genomes at the proper evolutionary distances can influence the overall quality of orthology detection. The curated gene families, called Reference Orthologous Groups, are publicly available at http://eggnog.embl.de/orthobench2.

18.
The increasing number of sequenced genomes has prompted the development of several automated orthology prediction methods. Tests to evaluate the accuracy of predictions and to explore biases caused by biological and technical factors are therefore required. We used 70 manually curated families to analyze the performance of five public methods in Metazoa. We analyzed the strengths and weaknesses of the methods and quantified the impact of biological and technical challenges. From the latter part of the analysis, genome annotation emerged as the largest single influencer, affecting up to 30% of the performance. Generally, most methods did well in assigning orthologous groups, but they failed to assign the exact number of genes for half of the groups. The publicly available benchmark set (http://eggnog.embl.de/orthobench/) should facilitate the improvement of current orthology assignment protocols, which is of utmost importance for many fields of biology and should be tackled by a broad scientific community.

19.
There is a great need for standards in the orthology field. Users must contend with different ortholog data representations from each provider, and the providers themselves must independently gather and parse the input sequence data. These burdensome and redundant procedures make data comparison and integration difficult. We have designed two XML-based formats, SeqXML and OrthoXML, to solve these problems. SeqXML is a lightweight format for sequence records, the input for orthology prediction. It stores the same sequence and metadata as typical FASTA format records, but overcomes common problems such as unstructured metadata in the header and erroneous sequence content. XML provides validation to prevent data integrity problems that are frequent in FASTA files. The range of applications for SeqXML is broad and not limited to ortholog prediction. We provide read/write functions for BioJava, BioPerl, and Biopython. OrthoXML was designed to represent ortholog assignments from any source in a consistent and structured way, yet cater to specific needs such as scoring schemes or meta-information. A unified format is particularly valuable for ortholog consumers that want to integrate data from numerous resources, e.g. for gene annotation projects. Reference proteomes for 61 organisms are already available in SeqXML, and 10 orthology databases have signed on to OrthoXML. Adoption by the entire field would substantially facilitate exchange and quality control of sequence and orthology information.

20.