Similar articles
1.
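Orthology is the relationship between genes that share a common ancestor as a result of a speciation event; orthologous genes usually have similar structures and biological functions. With the rapid accumulation of genome and transcriptome sequences, accurate identification of orthologous genes supports functional gene annotation and research in comparative and evolutionary genomics. This review surveys the main existing methods for identifying orthologous genes and lists the databases built with them. The methods fall into three broad categories. The first is based on sequence similarity and offers fast identification and high sensitivity. The second is based on phylogenetic tree construction and offers high accuracy and rich information. The third is a hybrid that combines the two and better balances sensitivity and accuracy. The review closes with a summary of the problems that the identification process faces.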

2.
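Objective: To study the structural divergence and functional differentiation of the NIN transcription factor of Lotus japonicus and its paralogous proteins. Methods: The complete set of NIN paralogous protein sequences was retrieved from the Lotus japonicus genome and analysed with bioinformatics tools for phylogeny, physicochemical properties, functional sites, and homology modelling of tertiary structure. Results: Five NIN paralogous protein sequences were obtained in total, two of which are newly identified members; they belong to two different evolutionary branches. Proline content in LjNLP3 is 33% higher than in LjNLP1, which has the second-highest content, suggesting a drought-tolerance trait. Functional site analysis revealed differences among the NIN paralogous proteins, suggesting that they may have diverged functionally through post-translational modification. During evolution, the LjNIN protein replaced a β-sheet segment of LjNLP1 with an α-helix segment, a difference that may have led to the recruitment of Lotus japonicus Nin as a nodule-sensing gene. Conclusion: The study gives a preliminary account of the relationship between structural divergence and functional differentiation among the NIN paralogous proteins of Lotus japonicus and lays a foundation for further experimental work.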

4.
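Su Jie  Yao Yang  Huang Yuan  Liu Kaige 《生物磁学》2012,(23):4552-4554,4587
Homology is the relationship between two characters that have descended, usually with divergence, from a common ancestral character. The concept of homology underlies evolutionary genomics and matters greatly for functional genomics, but because it is often understood imprecisely, many vague statements about it are in circulation, so understanding its exact meaning is important. This article reviews the concepts and properties of homology, orthology and paralogy.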

5.
Ma LC  Wang YR  Liu ZP 《遗传》2012,34(5):621-634
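Floral organ-specific genes of Medicago truncatula G. are important genes involved in the formation and development of its floral organs. Screening these genes, finding their orthologs in other model plants, and comparing their expression patterns across plants helps to clarify their function in M. truncatula floral organ development. Based on M. truncatula expression profiles and using its PISTILLATA (PI) gene as a template, the article screened 97 floral organ-specific genes (Ratio ≥ 10 and Z ≥ 7.9). Homology comparison identified the orthologs of these genes in Arabidopsis (Arabidopsis thaliana L.), soybean (Glycine max L.), Lotus japonicus L., and rice (Oryza sativa L.). Comparing the expression level, expression site and function of these genes across the five plants showed that expression of orthologous genes varies little between closely related plants and more between distantly related plants. Promoter analysis of the orthologs with strongly diverged expression showed that changes in ortholog expression pattern among plants are associated with the characteristics of regulatory elements in the promoters.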

8.
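An evolutionary imprint method for genome function prediction   Cited by: 7 (self-citations: 1, citations by others: 6)
Improving genome function prediction schemes is an urgent problem in functional genomics. The course of evolution leaves evolutionary imprints on molecular sequences, namely motifs specific to orthologous clusters. On the basis of this biological fact, a new method for genome function prediction is proposed. Orthologous clusters are first built by evolutionary analysis, and the functional motifs of each cluster are then found, which yields a library of specific functional motifs. The function of an unknown gene can then be predicted efficiently and accurately by searching this library. Tests on five families give preliminary confirmation that the scheme is feasible.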

9.
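Yan Chenyang  Chen Yingnan 《植物学报》2020,55(4):442-456
Whole-genome duplication and tandem duplication are important mechanisms of gene duplication and major drivers of the diversification of genomes and genetic systems. LRR-RLK genes encode leucine-rich repeat receptor-like protein kinases and form a multigene family that arose through large-scale expansion during angiosperm evolution. The Arabidopsis thaliana AtLRR-RLK family comprises 15 subfamilies, of which AtLRR VIII-2 has the highest proportion of tandem duplication. Analysis of the expansion and differential retention of LRR VIII-2 subfamily genes in four model plants, Arabidopsis, poplar (Populus trichocarpa), grape (Vitis vinifera) and papaya (Carica papaya), showed that LRR VIII-2 expanded most in poplar, expanded to an intermediate degree in Arabidopsis and grape, and underwent loss in papaya. The LRR VIII-2 subfamily has paralogous gene pairs in Arabidopsis, poplar and grape, but no paralogs were found in papaya. Except for one paralogous pair in poplar, the paralogous and orthologous LRR VIII-2 genes in the four model plants are all under fairly strong purifying selection. In-depth analysis of the evolutionary history of the LRR VIII-2 subfamily helps in understanding the role and significance of gene duplication in plant evolution and can serve as a reference for predicting the functions of homologous genes and resolving the evolutionary history of other gene families.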

10.
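Paralogs and orthologs are the two basic types of homologous sequence produced during species evolution. Methods for identifying orthologs are by now largely established, but there is still no unified standard for identifying paralogs. With whole-genome sequencing of tomato under way, a series of blastn alignments under different parameter settings was run on the tomato BAC sequences already in GenBank, and the results were used to determine the best parameters for paralog prediction: an E-value of 10^-40, a matched sequence length of 200 bp, and a sequence identity of 80%. These values provide suitable parameters for future paralog prediction in tomato BAC sequences.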
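The three thresholds reported in this abstract can be read as a filter over blastn hits. The following is a minimal sketch under stated assumptions, not the authors' pipeline: it assumes the hits are available in the standard 12-column BLAST tabular format, it reads the reported values as cut-offs (E-value at most 1e-40, alignment length at least 200 bp, identity at least 80%), and the names `Hit`, `parse_tabular` and `candidate_paralogs` are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    identity: float  # percent identity of the alignment
    length: int      # alignment length in bp
    evalue: float


def parse_tabular(lines: Iterable[str]) -> List[Hit]:
    """Parse 12-column BLAST tabular lines (qseqid, sseqid, pident, length, ..., evalue, bitscore)."""
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        hits.append(Hit(fields[0], fields[1], float(fields[2]),
                        int(fields[3]), float(fields[10])))
    return hits


def candidate_paralogs(hits: Iterable[Hit],
                       max_evalue: float = 1e-40,
                       min_length: int = 200,
                       min_identity: float = 80.0) -> List[Hit]:
    """Keep non-self hits that pass all three cut-offs (assumed interpretation)."""
    return [h for h in hits
            if h.query != h.subject
            and h.evalue <= max_evalue
            and h.length >= min_length
            and h.identity >= min_identity]
```

Self-hits are dropped because a sequence aligned to itself passes any threshold; whether the original study handled them this way is not stated in the abstract.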

11.
Assignment of orthologous genes via genome rearrangement   Cited by: 1 (self-citations: 0, citations by others: 1)
The assignment of orthologous genes between a pair of genomes is a fundamental and challenging problem in comparative genomics. Existing methods that assign orthologs based on the similarity between DNA or protein sequences may make erroneous assignments when sequence similarity does not clearly delineate the evolutionary relationship among genes of the same families. In this paper, we present a new approach to ortholog assignment that takes into account both sequence similarity and evolutionary events at a genome level, where orthologous genes are assumed to correspond to each other in the most parsimonious evolutionary scenario under genome rearrangement. First, the problem is formulated as that of computing the signed reversal distance with duplicates between the two genomes of interest. Then, the problem is decomposed into two new optimization problems, called minimum common partition and maximum cycle decomposition, for which efficient heuristic algorithms are given. Following this approach, we have implemented a high-throughput system for assigning orthologs on a genome scale, called SOAR, and tested it on both simulated data and real genome sequence data. Compared to a recent ortholog assignment method based entirely on homology search (called INPARANOID), SOAR shows a marginally better performance in terms of sensitivity on the real data set because it is able to identify several correct orthologous pairs that are missed by INPARANOID. The simulation results demonstrate that SOAR, in general, performs better than the iterated exemplar algorithm in terms of computing the reversal distance and assigning correct orthologs.

12.
The rapid increase in published genomic sequences for bacteria presents the first opportunity to reconstruct evolutionary events on the scale of entire genomes. However, extensive lateral gene transfer (LGT) may thwart this goal by preventing the establishment of organismal relationships based on individual gene phylogenies. The group for which cases of LGT are most frequently documented and for which the greatest density of complete genome sequences is available is the gamma-Proteobacteria, an ecologically diverse and ancient group including free-living species as well as pathogens and intracellular symbionts of plants and animals. We propose an approach to multigene phylogeny using complete genomes and apply it to the case of the gamma-Proteobacteria. We first applied stringent criteria to identify a set of likely gene orthologs and then tested the compatibilities of the resulting protein alignments with several phylogenetic hypotheses. Our results demonstrate phylogenetic concordance among virtually all (203 of 205) of the selected gene families, with each of the exceptions consistent with a single LGT event. The concatenated sequences of the concordant families yield a fully resolved phylogeny. This topology also received strong support in analyses aimed at excluding effects of heterogeneity in nucleotide base composition across lineages. Our analysis indicates that single-copy orthologous genes are resistant to horizontal transfer, even in ancient bacterial groups subject to high rates of LGT. This gene set can be identified and used to yield robust hypotheses for organismal phylogenies, thus establishing a foundation for reconstructing the evolutionary transitions, such as gene transfer, that underlie diversity in genome content and organization.

15.
16.
The quest for orthologs: finding the corresponding gene across genomes   Cited by: 2 (self-citations: 0, citations by others: 2)
Orthology is a key evolutionary concept in many areas of genomic research. It provides a framework for subjects as diverse as the evolution of genomes, gene functions, cellular networks and functional genome annotation. Although orthologous proteins usually perform equivalent functions in different species, establishing true orthologous relationships requires a phylogenetic approach, which combines both trees and graphs (networks) using reliable species phylogeny and available genomic data from more than two species, and an insight into the processes of molecular evolution. Here, we evaluate the available bioinformatics tools and provide a set of guidelines to aid researchers in choosing the most appropriate tool for any situation.

17.
“Phylogenetic profiling” is based on the hypothesis that during evolution functionally or physically interacting genes are likely to be inherited or eliminated in a codependent manner. Creating presence–absence profiles of orthologous genes is now a common and powerful way of identifying functionally associated genes. In this approach, correctly determining orthology, as a means of identifying functional equivalence between two genes, is a critical and nontrivial step and largely explains why previous work in this area has mainly focused on using presence–absence profiles in prokaryotic species. Here, we demonstrate that eukaryotic genomes have a high proportion of multigene families whose phylogenetic profile distributions are poor in presence–absence information content. This feature makes them prone to orthology mis-assignment and unsuited to standard profile-based prediction methods. Using CATH structural domain assignments from the Gene3D database for 13 complete eukaryotic genomes, we have developed a novel modification of the phylogenetic profiling method that uses genome copy number of each domain superfamily to predict functional relationships. In our approach, superfamilies are subclustered at ten levels of sequence identity (from 30% to 100%) and phylogenetic profiles built at each level. All the profiles are compared using normalised Euclidean distances to identify those with correlated changes in their domain copy number. We demonstrate that two protein families will “auto-tune” with strong co-evolutionary signals when their profiles are compared at the similarity levels that capture their functional relationship. Our method finds functional relationships that are not detectable by the conventional presence–absence profile comparisons, and it does not require a priori any fixed criteria to define orthologous genes.
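The comparison of copy-number profiles described in this abstract can be illustrated with a small sketch. The abstract does not say how the Euclidean distance is normalised, so the code below assumes one plausible choice: each profile (one copy-number count per genome) is scaled to unit length before the distance is taken. The function names `normalise` and `profile_distance` are hypothetical.

```python
import math
from typing import List, Sequence


def normalise(profile: Sequence[float]) -> List[float]:
    """Scale a copy-number profile to unit length (assumed normalisation)."""
    norm = math.sqrt(sum(x * x for x in profile))
    if norm == 0:
        return [0.0] * len(profile)  # family absent from every genome
    return [x / norm for x in profile]


def profile_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two unit-scaled profiles over the same genomes."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same genomes")
    na, nb = normalise(a), normalise(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(na, nb)))
```

Under this assumption, two families whose copy numbers rise and fall in proportion across genomes have a distance close to zero, while families with unrelated copy-number patterns lie further apart.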

18.
Aevermann BD  Waters ER 《Genetica》2008,133(3):307-319
The small heat shock proteins (sHSPs) are a ubiquitous family of molecular chaperones. We have identified 18 sHSPs in the Caenorhabditis elegans genome and 20 sHSPs in the Caenorhabditis briggsae genome. Analysis of phylogenetic relationships and evolutionary dynamics of the sHSPs in these two genomes reveals a very complex pattern of evolution. The sHSPs in C. elegans and C. briggsae do not display clear orthologous relationships with other invertebrate sHSPs. But many sHSPs in C. elegans have orthologs in C. briggsae. One group of sHSPs, the HSP16s, has a very unusual evolutionary history. Although there are a number of HSP16s in both the C. elegans and C. briggsae genomes, none of the HSP16s display orthologous relationships across these two species. The HSP16s have an unusual gene pair structure and a complex evolutionary history shaped by gene duplication, gene conversion, and purifying selection. We found no evidence of recent positive selection acting on any of the sHSPs in C. elegans or in C. briggsae. There is also no evidence of functional divergence within the pairs of orthologous C. elegans and C. briggsae sHSPs. However, the evolutionary patterns do suggest that functional divergence has occurred between the sHSPs in C. elegans and C. briggsae and the sHSPs in more distantly related invertebrates.

19.
La D  Silver M  Edgar RC  Livesay DR 《Biochemistry》2003,42(30):8988-8998
Protein motifs represent highly conserved regions within protein families and are generally accepted to describe critical regions required for protein stability and/or function. In this comprehensive analysis, we present a robust, unique approach to identify and compare corresponding mesophilic and thermophilic sequence motifs between all orthologous proteins within 44 microbial genomes. Motif similarity is determined through global sequence alignment of mesophilic and thermophilic motif pairs, which are identified by a greedy algorithm. Our results reveal only modest correlation between motif and overall sequence similarity, highlighting the rationale of motif-based approaches in comprehensive multigenome comparisons. Conserved mutations reflect previously suggested physicochemical principles for conferring thermostability. Additionally, comparisons between corresponding mesophilic and thermophilic motif pairs provide key biochemical insights related to thermostability and can be used to test the evolutionary robustness of individual structural comparisons. We demonstrate the ability of our unique approach to provide key insights in two examples: the TATA-box binding protein and glutamate dehydrogenase families. In the latter example, conserved mutations hint at novel origins leading to structural stability differences within the hexamer structures. Additionally, we present amino acid composition data and average protein length comparisons for all 44 microbial genomes.

20.

Background  

Gene duplication and gene loss during the evolution of eukaryotes have hindered attempts to estimate phylogenies and divergence times of species. Although current methods that identify clusters of orthologous genes in complete genomes have helped to investigate gene function and gene content, they have not been optimized for evolutionary sequence analyses requiring strict orthology and complete gene matrices. Here we adopt a relatively simple and fast genome comparison approach designed to assemble orthologs for evolutionary analysis. Our approach identifies single-copy genes representing only species divergences (panorthologs) in order to minimize potential errors caused by gene duplication. We apply this approach to complete sets of proteins from published eukaryote genomes specifically for phylogeny and time estimation.
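The selection of single-copy genes for a complete gene matrix can be sketched as a filter over precomputed gene clusters. This is an illustration only, not the genome comparison procedure of the paper: it assumes the clusters already exist as lists of (species, gene) pairs, and the function name `single_copy_clusters` and the input layout are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def single_copy_clusters(
    clusters: Dict[str, List[Tuple[str, str]]],
    species: Iterable[str],
) -> Dict[str, Dict[str, str]]:
    """Keep clusters in which every species is represented by exactly one gene."""
    wanted = set(species)
    selected = {}
    for cluster_id, members in clusters.items():
        counts = Counter(sp for sp, _gene in members)
        # complete matrix: all species present; strict orthology: no duplicates
        if set(counts) == wanted and all(n == 1 for n in counts.values()):
            selected[cluster_id] = {sp: gene for sp, gene in members}
    return selected
```

Clusters with a duplicated gene in any species, or with a species missing, are discarded, which trades coverage for a matrix with no gaps and no paralogs.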
