Similar articles
20 similar articles found.
1.
2.
3.
A common assumption in comparative genomics is that orthologous genes share greater functional similarity than do paralogous genes (the "ortholog conjecture"). Many methods used to computationally predict protein function are based on this assumption, even though it is largely untested. Here we present the first large-scale test of the ortholog conjecture using comparative functional genomic data from human and mouse. We use the experimentally derived functions of more than 8,900 genes, as well as an independent microarray dataset, to directly assess our ability to predict function using both orthologs and paralogs. Both datasets show that paralogs are often a much better predictor of function than are orthologs, even at lower sequence identities. Among paralogs, those found within the same species are consistently more functionally similar than those found in a different species. We also find that paralogous pairs residing on the same chromosome are more functionally similar than those on different chromosomes, perhaps due to higher levels of interlocus gene conversion between these pairs. In addition to offering implications for the computational prediction of protein function, our results shed light on the relationship between sequence divergence and functional divergence. We conclude that the most important factor in the evolution of function is not amino acid sequence, but rather the cellular context in which proteins act.

4.
Gene duplication is a crucial mechanism of evolutionary innovation. A substantial fraction of eukaryotic genomes consists of paralogous gene families. We assess the extent of ancestral paralogy, which dates back to the last common ancestor of all eukaryotes, and examine the origins of the ancestral paralogs and their potential roles in the emergence of the eukaryotic cell complexity. A parsimonious reconstruction of ancestral gene repertoires shows that 4137 orthologous gene sets in the last eukaryotic common ancestor (LECA) map back to 2150 orthologous sets in the hypothetical first eukaryotic common ancestor (FECA) [paralogy quotient (PQ) of 1.92]. Analogous reconstructions show significantly lower levels of paralogy in prokaryotes, 1.19 for archaea and 1.25 for bacteria. The only functional class of eukaryotic proteins with a significant excess of paralogous clusters over the mean includes molecular chaperones and proteins with related functions. Almost all genes in this category underwent multiple duplications during early eukaryotic evolution. In structural terms, the most prominent sets of paralogs are superstructure-forming proteins with repetitive domains, such as WD-40 and TPR. In addition to the true ancestral paralogs which evolved via duplication at the onset of eukaryotic evolution, numerous pseudoparalogs were detected, i.e. homologous genes that apparently were acquired by early eukaryotes via different routes, including horizontal gene transfer (HGT) from diverse bacteria. The results of this study demonstrate a major increase in the level of gene paralogy as a hallmark of the early evolution of eukaryotes.
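
The paralogy quotient quoted in this abstract can be checked with a short calculation. The sketch below assumes that PQ is simply the number of orthologous gene sets in the descendant ancestor divided by the number of sets they map back to; the abstract does not spell out the formula, but this ratio reproduces the reported 1.92. The function name `paralogy_quotient` is hypothetical.

```python
def paralogy_quotient(descendant_sets: int, ancestral_sets: int) -> float:
    """Ratio of orthologous gene sets in the later ancestor to the sets
    they map back to in the earlier ancestor.

    Assumption: this ratio is how PQ is defined; it is inferred from the
    numbers in the abstract, not stated there explicitly.
    """
    if ancestral_sets <= 0:
        raise ValueError("ancestral_sets must be positive")
    return descendant_sets / ancestral_sets


# Counts taken from the abstract: 4137 LECA sets map back to 2150 FECA sets.
pq_eukaryotes = paralogy_quotient(4137, 2150)
print(f"Eukaryote PQ (LECA/FECA): {pq_eukaryotes:.2f}")
```

On this reading, a PQ near 1 means few ancestral duplications, which is how the prokaryotic values of 1.19 and 1.25 compare with the eukaryotic value.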

5.
Although a quantitative relationship between sequence similarity and structural similarity has long been established, little is known about the impact of orthology on the relationship between protein sequence and structure. Among homologs, orthologs (derived by speciation) more frequently have similar functions than paralogs (derived by duplication). Here, we hypothesize that an orthologous pair will tend to exhibit greater structural similarity than a paralogous pair at the same level of sequence similarity. To test this hypothesis, we used 284,459 pairwise structure‐based alignments of 12,634 unique domains from SCOP as well as orthology and paralogy assignments from OrthoMCL DB. We divided the comparisons by sequence identity and determined whether the sequence‐structure relationship differed between the orthologs and paralogs. We found that at levels of sequence identity between 30 and 70%, orthologous domain pairs indeed tend to be significantly more structurally similar than paralogous pairs at the same level of sequence identity. An even larger difference is found when comparing ligand binding residues instead of whole domains. These differences between orthologs and paralogs are expected to be useful for selecting template structures in comparative modeling and target proteins in structural genomics.

6.
As the result of the EUROIMAGE Consortium sequencing project, we have isolated and characterized a novel gene on chromosome 15, TM6SF1. It encodes a 370 amino acid product with enhanced expression in spleen, testis and peripheral blood leukocytes. We have identified another gene, paralogous to TM6SF1 on chromosome 19p12, TM6SF2, with an overall similarity of 68% and 52% identity at the protein level. This conservation has led us to uncover a series of eleven genes in 19p13.3→p12 with close homology to genes in 15q24→q26. The percentage of sequence similarity between each paralogous pair of genes at the protein level ranges between 43 and 89%. A partial conservation of synteny with mouse chromosomes 7, 8 and 9 is also observed. The corresponding orthologous genes in mouse of human TM6SF1 and TM6SF2 show a high degree of amino acid sequence conservation.

7.
Dlx homeobox genes of vertebrates are often organised as physically linked pairs in which the two genes are transcribed convergently (tail-to-tail arrangement). Three such Dlx pairs have been found in mouse, human, and zebrafish and are thought to have originated from the duplication of an ancestral gene pair. These pairs include Dlx1/Dlx2, Dlx7/Dlx3, and Dlx6/Dlx5 (the zebrafish orthologue of Dlx5 is named dlx4). Expression patterns of physically linked Dlx genes overlap extensively. Furthermore, orthologous Dlx genes often show highly similar expression patterns. We analysed Dlx expression during the gastrula and early somitogenesis of the mouse and zebrafish. It was found that expression of the mouse Dlx6 gene takes place in the rostral ectoderm and presumptive olfactory and otic placodes with patterns similar to the previously reported expression of the physically linked Dlx5 gene. However, we observed only very weak expression of the mouse Dlx3 gene at the same stage. This contrasts with the expression of dlx genes in zebrafish where dlx3 and dlx7, but not dlx4 and dlx6 are expressed during gastrulation in the rostral ectoderm and presumptive placodes. Thus, Dlx expression patterns at early stages are better conserved between paralogous pairs of physically linked genes than between orthologous pairs. This suggests that early expression of Dlx genes existed prior to the duplications that led to the multiple pairs of physically linked genes but was differentially conserved in different paralogs in zebrafish and mice.

8.
9.
On the incidence of intron loss and gain in paralogous gene families
Understanding gene duplication and gene structure evolution are fundamental goals of molecular evolutionary biology. A previous study by Babenko et al. (2004. Prevalence of intron gain over intron loss in the evolution of paralogous gene families. Nucleic Acids Res. 32:3724-3733) employed Dollo parsimony to infer spliceosomal intron losses and gains in paralogous gene families and concluded that there was a general excess of gains over losses. This result contrasts with patterns in orthologous genes, in which most lineages show an excess of intron losses over gains, suggesting the possibility of fundamentally different modes of intron evolution between orthologous and paralogous genes. We further studied the data and found a low level of intron position conservation with outgroups, and this led to problems with using Dollo parsimony to analyze the data. Statistical reanalysis of the data suggests, instead, that intron losses have outnumbered intron gains in paralogous gene families.

10.
11.
Prediction of operons in microbial genomes

12.
Gene duplication provides the opportunity for subsequent refinement of distinct functions of the duplicated copies. Either through changes in coding sequence or changes in regulatory regions, duplicate copies appear to obtain new or tissue-specific functions. If this divergence were driven by natural selection, we would expect duplicated copies to have differentiated patterns of substitutions. We tested this hypothesis using genes that duplicated before the human/mouse split and whose orthologous relations were clear. The null hypothesis is that the number of amino acid changes between humans and mice was distributed similarly across different paralogs. We used a method modified from Tang and Lewontin to detect heterogeneity in the amino acid substitution pattern between those different paralogs. Our results show that many of the paralogous gene pairs appear to be under differential selection in the human/mouse comparison. The properties that led to diversification appear to have arisen before the split of the human and mouse lineages. Further study of the diverged genes revealed insights regarding the patterns of amino acid substitution that resulted in differences in function and/or expression of these genes. This approach has utility in the study of newly identified members of gene families in genomewide data mining and for contrasting the merits of alternative hypotheses for the evolutionary divergence of function of duplicated genes.

13.
14.
It has been shown that gene body DNA methylation is associated with gene expression. However, whether and how deviation of gene body DNA methylation between duplicate genes can influence their divergence remains largely unexplored. Here, we aim to elucidate the potential role of gene body DNA methylation in the fate of duplicate genes. We identified paralogous gene pairs from Arabidopsis and rice (Oryza sativa ssp. japonica) genomes and reprocessed their single-base resolution methylome data. We show that methylation in paralogous genes nonlinearly correlates with several gene properties including exon number/gene length, expression level and mutation rate. Further, we demonstrate that divergence of methylation level and pattern in paralogs indeed correlates positively with their sequence and expression divergence. This result held even after controlling for other confounding factors known to influence the divergence of paralogs. We observed that methylation level divergence might be more relevant to the expression divergence of paralogs than methylation pattern divergence. Finally, we explored the mechanisms that might give rise to the divergence of gene body methylation in paralogs. We found that exonic methylation divergence more closely correlates with expression divergence than intronic methylation divergence. We show that genomic environments (e.g., flanked by transposable elements and repetitive sequences) of paralogs generated by various duplication mechanisms are associated with the methylation divergence of paralogs. Overall, our results suggest that the changes in gene body DNA methylation could provide another avenue for duplicate genes to develop differential expression patterns and undergo different evolutionary fates in plant genomes.

15.
16.
17.
Operons are clusters of genes that are co-regulated from a common promoter. Operons are typically associated with prokaryotes, although a small number of eukaryotes have been shown to possess them. Among metazoans, operons have been extensively characterized in the nematode Caenorhabditis elegans in which ~15% of the total genes are organized into operons. The most recent genome assembly for the ascidian Ciona intestinalis placed ~20% of the genes (2909 total) into 1310 operons. The majority of these operons are composed of two genes, while the largest are composed of six. Here we report a computational analysis of the genes that comprise the Ciona operons. Gene ontology (GO) terms were identified for about two-thirds of the operon-encoded genes. Using the extensive collection of public EST libraries, estimates of temporal patterns of gene expression were generated for the operon-encoded genes. Lastly, conservation of operons was analyzed by determining how many operon-encoded genes were present in the ascidian Ciona savignyi and whether these genes were organized in orthologous operons. Over 68% of the operon-encoded genes could be assigned one or more GO terms, and in 697 of the 1310 operons every gene had at least one GO term. Of these 697 operons, GO terms were shared by all of the genes within 146 individual operons, suggesting that most operons encode genes with unrelated functions. An analysis of operon gene expression from nine different EST libraries indicated that for 587 operons, all of the genes that comprise an individual operon were expressed together in at least one EST library, suggesting that these genes may be co-regulated. About 50% (74/146) of the operons with shared GO terms also showed evidence of gene co-regulation. Comparisons with the C. savignyi genome identified orthologs for 1907 of 2909 operon genes. About 38% (504/1310) of the operons are conserved between the two Ciona species. These results suggest that, as in C. elegans, operons in Ciona comprise a variety of genes that are not necessarily related in function. The genes in only 50% of the operons appear to be co-regulated, suggesting that more complex gene regulatory mechanisms are likely operating.
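
The proportions in this abstract follow from the raw counts it reports. The sketch below recomputes them as a quick arithmetic check; the dictionary keys and the helper `pct` are hypothetical names introduced for illustration, and only the counts themselves come from the abstract.

```python
# Counts as reported in the abstract (labels are illustrative only).
counts = {
    "operons_total": 1310,
    "operon_genes_total": 2909,
    "operons_all_genes_with_go": 697,
    "operons_shared_go": 146,
    "operons_coexpressed": 587,
    "shared_go_and_coexpressed": 74,
    "genes_with_savignyi_ortholog": 1907,
    "operons_conserved": 504,
}


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


summary = {
    "shared GO among fully annotated operons": pct(
        counts["operons_shared_go"], counts["operons_all_genes_with_go"]),
    "co-regulated among shared-GO operons": pct(
        counts["shared_go_and_coexpressed"], counts["operons_shared_go"]),
    "operons co-expressed in at least one EST library": pct(
        counts["operons_coexpressed"], counts["operons_total"]),
    "operon genes with a C. savignyi ortholog": pct(
        counts["genes_with_savignyi_ortholog"], counts["operon_genes_total"]),
    "operons conserved between the two species": pct(
        counts["operons_conserved"], counts["operons_total"]),
}

for label, value in summary.items():
    print(f"{label}: {value:.1f}%")
```

The shared-GO fraction (146 of 697) is what supports the statement that most operons encode genes with unrelated functions.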

18.
We report on a new zebrafish T-box-containing gene, tbx16. It encodes a message that is first detected throughout the blastoderm soon after the initiation of zygotic gene expression. Following gastrulation, expression becomes restricted to paraxial mesoderm and later primarily to the developing tail bud. To gain an evolutionary perspective on the potential function of this gene, we have analyzed its phylogenetic relationships to known T-box genes from other species. Zebrafish tbx16 is likely orthologous to the chicken Tbx6L and Xenopus Xombi/Antipodean/Brat/VegT genes. Our analysis also shows that zebrafish tbx6 and mouse Tbx6 genes are paralogous to zebrafish tbx16. We present evidence that, despite the same name and similar expression, zebrafish tbx6 and mouse Tbx6 genes are not orthologous to each other but instead represent relatively distant paralogs. The expression patterns of all genes are discussed in the light of their evolutionary relationships. Received: 27 November 1997 / Accepted: 27 January 1998

19.
The cryptic asc (previously called "SAC") operon of Escherichia coli K12 has been completely sequenced. It encodes a repressor (ascG); a PTS enzyme IIasc for the transport of arbutin, salicin, and cellobiose (ascF); and a phospho-beta-glucosidase that hydrolyzes the sugars which are phosphorylated during transport (ascB). ascG and ascFB are transcribed from divergent promoters. The cryptic operon is activated by the insertion of IS186 into the ascG (repressor) gene. The ascFB genes are paralogous to the cryptic bglFB genes, and ascG is paralogous to galR. The duplications that gave rise to these paralogous genes are estimated to have occurred approximately 320 Mya, a time that predates the divergence of E. coli and Salmonella typhimurium.

20.
The enlargement of the genome size and the decrease in genome compactness with increase in the number and size of introns is a general pattern during the evolution of eukaryotes. Among the possible mechanisms for modifying intron size, it has been suggested that the insertion of transposable elements might have an important role in driving intron evolution. The analysis of large portions of the human genome demonstrated that a relatively recent (50 to 100 MYA) accumulation of transposable elements appears to be biased, favoring a preferential insertion of LINE1 transposons into sex chromosomes rather than into autosomes. In the present work, the effect of chromosomal location on the increase in size of introns was evaluated with a comparative analysis performed on pairs of human paralogous genes, one located on the X chromosome and the second on an autosome. A phylogenetic analysis was also performed on the X-encoded proteins and their paralogs to confirm orthology-paralogy and to approximately estimate the time of gene duplication. Statistical analysis of total intron length for each pair of paralogous genes provided no evidence for a larger size of introns in the gene copies located on the X chromosome. On the contrary, introns of autosomal genes were found to be significantly longer than introns of their X-linked paralogs. Likewise, LINE1 elements were not significantly more frequent in X-chromosome introns, whereas the frequency of SINE elements showed a marginally significant bias toward autosomal introns.
