Similar literature
Found 20 similar articles (search time: 15 ms)
1.
Gene duplication is a crucial mechanism of evolutionary innovation. A substantial fraction of eukaryotic genomes consists of paralogous gene families. We assess the extent of ancestral paralogy, which dates back to the last common ancestor of all eukaryotes, and examine the origins of the ancestral paralogs and their potential roles in the emergence of eukaryotic cell complexity. A parsimonious reconstruction of ancestral gene repertoires shows that 4137 orthologous gene sets in the last eukaryotic common ancestor (LECA) map back to 2150 orthologous sets in the hypothetical first eukaryotic common ancestor (FECA) [paralogy quotient (PQ) of 1.92]. Analogous reconstructions show significantly lower levels of paralogy in prokaryotes, 1.19 for archaea and 1.25 for bacteria. The only functional class of eukaryotic proteins with a significant excess of paralogous clusters over the mean includes molecular chaperones and proteins with related functions. Almost all genes in this category underwent multiple duplications during early eukaryotic evolution. In structural terms, the most prominent sets of paralogs are superstructure-forming proteins with repetitive domains, such as WD-40 and TPR. In addition to the true ancestral paralogs, which evolved via duplication at the onset of eukaryotic evolution, numerous pseudoparalogs were detected, i.e. homologous genes that apparently were acquired by early eukaryotes via different routes, including horizontal gene transfer (HGT) from diverse bacteria. The results of this study demonstrate a major increase in the level of gene paralogy as a hallmark of the early evolution of eukaryotes.
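The paralogy quotient reported in the abstract above appears to be the ratio of descendant orthologous sets to the ancestral sets they map back to. The sketch below checks that arithmetic under this assumption; the function name `paralogy_quotient` is hypothetical and is not taken from the study.

```python
def paralogy_quotient(descendant_sets: int, ancestral_sets: int) -> float:
    """Ratio of descendant orthologous sets to ancestral orthologous sets.

    Assumption: PQ is a plain ratio, as the figures in the abstract suggest.
    """
    if ancestral_sets <= 0:
        raise ValueError("ancestral_sets must be positive")
    return descendant_sets / ancestral_sets


# Figures quoted in the abstract: 4137 LECA sets map back to 2150 FECA sets.
pq_eukaryotes = paralogy_quotient(4137, 2150)
print(f"Eukaryotic PQ: {pq_eukaryotes:.2f}")
```

Under this reading, the eukaryotic value rounds to the 1.92 given in the abstract, against the reported 1.19 for archaea and 1.25 for bacteria.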

2.
Two rounds of large-scale duplications are thought to have occurred in early vertebrate ancestry; this is now known as the "2R hypothesis." They have led to the constitution of subfamilies of paralogous genes. Chromosomal regions that contain present-day paralogs (paralogous regions or paralogons) have been identified in mammals. We show that sets of paralogons (PGs) can be assembled in a tentative "human genome paralogy map" that includes all autosomes and X. A total of 14 PGs, containing more than 1600 genes, were assembled in this paralogy map. Genes that belong to the same PG are coparalogs. We show that identification of coparalogy can be used (i) to broaden data on gene mapping, (ii) to identify physical gene clusters that derive from early cis-duplications, and (iii) to speculate on coevolution and coregulation of genes sharing a common structure or function (functional clusters). Thus, coparalogy analyses should parallel phylogenetic analyses and can help draw hypotheses on gene and genome evolution.

3.
Unlike in eukaryotes, which often recruit duplicated genes into existing protein-protein interaction (PPI) networks, the low levels of gene duplication in bacteria, coupled with the high probability of lateral transfer of novel genes, alter the manner in which PPI networks can evolve. By inferring the PPIs present in the ancestor to contemporary Gammaproteobacteria, we were able to trace the changes in gene repertoires, and their consequences for PPI network evolution, in several bacterial lineages that have independently undergone reductions in genome size and genome contents. As genomes degrade, virtually all multi-partner proteins have lost interactors; however, the overall average number of connections increases due to the preferential elimination of proteins that interact with only one other protein partner. We also studied the effect of lateral gene transfer on PPI network evolution by analyzing the connectivity of genes that have been gained along the Escherichia coli lineage, as well as those acquired genes subsequently silenced in Shigella flexneri, since diverging from the gammaproteobacterial ancestor. The situation in PPI networks, in which newly acquired genes preferentially attach to the hubs of the network, contrasts with that observed in metabolic networks, which evolve by the peripheral gain and loss of genes, and in regulatory networks, in which high connectivity increases the propensity of loss.

4.
Members of the Structural Maintenance of Chromosome (SMC) family have long been of interest to molecular and evolutionary biologists for their role in chromosome structural dynamics, particularly sister chromatid cohesion, condensation, and DNA repair. SMC and related proteins are found in all major groups of living organisms and share a common structure of conserved N and C globular domains separated from the conserved hinge domain by long coiled-coil regions. In eukaryotes there are six paralogous proteins that form three heterodimeric pairs, whereas in prokaryotes there is only one SMC protein that homodimerizes. From recently completed genome sequences, we have identified SMC genes from 34 eukaryotes that have not been described in previous reports. Our phylogenetic analysis of these and previously identified SMC genes supports an origin for the vertebrate meiotic SMC1 in the most recent common ancestor since the divergence from invertebrate animals. Additionally, we have identified duplicate copies due to segmental duplications for some of the SMC paralogs in plants and yeast, mainly SMC2 and SMC6, and detected evidence that duplicates of other paralogs were lost, suggesting differential evolution for these genes. Our analysis indicates that the SMC paralogs have been stably maintained at very low copy numbers, even after segmental (genome-wide) duplications. It is possible that such low copy numbers might be selected during eukaryotic evolution, although other possibilities are not ruled out.

5.
The availability of multiple teleost (bony fish) genomes is providing unprecedented opportunities to understand the diversity and function of gene duplication events using comparative genomics. Here we examine multiple paralogous genes of γ-glutamyl transferase (GGT) in several distantly related teleost species including medaka, stickleback, green spotted pufferfish, fugu, and zebrafish. Through mining genome databases, we have identified multiple GGT orthologs. Duplicate (paralogous) GGT sequences for GGT1 (GGT1 a and b), GGTL1 (GGTL1 a and b), and GGTL3 (GGTL3 a and b) were identified for each species. Phylogenetic analysis suggests that GGTs are ancient proteins conserved across most metazoan phyla and those paralogous GGTs in teleosts likely arose from the serial 3R genome duplication events. A third GGTL1 gene (GGTL1c) was found in green spotted pufferfish; however, this gene is not present in medaka, stickleback, or fugu. Similarly, one or both paralogs of GGTL3 appear to have been lost in green spotted pufferfish, fugu, and zebrafish. Syntenic relationships were highly maintained between duplicated teleost chromosomes, among teleosts and across ray-finned (Actinopterygii) and lobe-finned (Sarcopterygii) species. To assess subfunction partitioning, six medaka GGT genes were cloned and assessed for developmental and tissue-specific expression. On the basis of these data, we propose a modification of the "duplication-degeneration-complementation" model of subfunction partitioning where quantitative differences rather than absolute differences in gene expression are observed between gene paralogs. Our results demonstrate that multiple GGT genes have been retained within teleost genomes. Questions remain, however, regarding the functional roles of multiple GGTs in these species.

6.
7.
Lrp (leucine-responsive regulatory protein) plays a global regulatory role in Escherichia coli, affecting expression of dozens of operons. Numerous lrp-related genes have been identified in different bacteria and archaea, including asnC, an E. coli gene that was the first reported member of this family. Pairwise comparisons of amino acid sequences of the corresponding proteins show an average sequence identity of only 29% for the vast majority of comparisons. By contrast, Lrp-related proteins from enteric bacteria show more than 97% amino acid identity. Is the global regulatory role associated with E. coli Lrp limited to enteric bacteria? To probe this question we investigated LrfB, an Lrp-related protein from Haemophilus influenzae that shares 75% sequence identity with E. coli Lrp (highest sequence identity among 42 sequences compared). A strain of H. influenzae having an lrfB null allele grew at the wild-type growth rate but with a filamentous morphology. A comparison of two-dimensional (2D) electrophoretic patterns of proteins from parent and mutant strains showed only two differences (comparable studies with lrp(+) and lrp E. coli strains by others showed 20 differences). The abundance of LrfB in H. influenzae, estimated by Western blotting experiments, was about 130 dimers per cell (compared to 3,000 dimers per E. coli cell). LrfB expressed in E. coli replaced Lrp as a repressor of the lrp gene but acted only to a limited extent as an activator of the ilvIH operon. Thus, although LrfB resembles Lrp sufficiently to perform some of its functions, its low abundance is consonant with a more local role in regulating but a few genes, a view consistent with the results of the 2D electrophoretic analysis. We speculate that an Lrp having a global regulatory role evolved to help enteric bacteria adapt to their ecological niches and that it is unlikely that Lrp-related proteins in other organisms have a broad regulatory function.

8.
Conservation of adjacency as evidence of paralogous operons   Total citations: 5 (self-citations: 2; citations by others: 3)

9.
Both mean genome size and the variance in genome size among species are smaller on average in birds (class Aves) than in the other tetrapod classes. In order to test whether loss of protein-coding genes has contributed to genome size reduction in birds, we compared the chicken genome and five mammalian genomes. Numbers of members (paralogs) were significantly lower in the chicken gene families than in the corresponding mammalian families. Phylogenetic analyses of chicken, mammal, and fish paralogs supported the hypothesis that chicken-specific loss of paralogs occurred much more frequently than mammal-specific gene duplications. Moreover, the phylogenetic analyses supported the hypothesis that a substantial majority of the paralogs lost in chicken originated from duplications prior to the most recent common ancestor of tetrapods and bony fishes. In addition to loss of paralogs, numerous gene families present in the mammalian genomes were missing in the chicken genome; over 1,000 of these families were found in bony fishes, implying presence of the family in the tetrapod ancestor. In the set of families with more members on average in the mammals than in the chicken, immune system function was associated with a greater degree of gene family size reduction in the chicken, consistent with other evidence that immune system gene families have become particularly compact in birds.

10.
The relative rates of change for eight sets of ubiquitous proteins were determined by a test in which anciently duplicated paralogs are used to root the universal tree and distances are calculated between each taxonomic group and the last common ancestor. The sets included ATPase subunits, elongation factors, signal recognition particle and its receptor, three sets of tRNA synthetases, transcarbamoylases, and an internal duplication in carbamoyl phosphate synthase. In each case phylogenetic trees were constructed and the distances determined for all pairs. Taken over the period of time since their last common ancestor, average evolutionary rates are remarkably similar for Bacteria and Eukarya, but Archaea exhibit a significantly slower average rate. Received: 30 December 1999 / Accepted: 5 April 2000

11.
Cytosolic ribosomes are among the largest multisubunit cellular complexes. Arabidopsis thaliana ribosomes consist of 79 different ribosomal proteins (r-proteins) that are each encoded by two to six (paralogous) genes. It is unknown whether the paralogs are incorporated into the ribosome and whether the relative incorporation of r-protein paralogs varies in response to environmental cues. Immunopurified ribosomes were isolated from A. thaliana rosette leaves fed with sucrose. Trypsin-digested samples were analyzed by qTOF-LC-MS using both MS(E) and classical MS/MS. Peptide features obtained by using these two methods were identified using MASCOT and Proteinlynx Global Server searching the theoretical sequences of A. thaliana proteins. The A. thaliana genome encodes 237 r-proteins and 69% of these were identified with proteotypic peptides for most of the identified proteins. These r-proteins were identified with average protein sequence coverage of 32% observed by MS(E). Interestingly, the analysis shows that the abundance of r-protein paralogs in the ribosome changes in response to sucrose feeding. This is particularly evident for paralogous RPS3aA, RPS5A, RPL8B, and RACK1 proteins. These results show that protein synthesis in the A. thaliana cytosol involves a heterogeneous ribosomal population. The implications of these findings in the regulation of translation are discussed.

12.
The enteric bacterium Escherichia coli synthesizes cobalamin (coenzyme B12) only when provided with the complex intermediate cobinamide. Three cobalamin biosynthetic genes have been cloned from Escherichia coli K-12, and their nucleotide sequences have been determined. The three genes form an operon (cob) under the control of several promoters and are induced by cobinamide, a precursor of cobalamin. The cob operon of E. coli comprises the cobU gene, encoding the bifunctional cobinamide kinase-guanylyltransferase; the cobS gene, encoding cobalamin synthetase; and the cobT gene, encoding dimethylbenzimidazole phosphoribosyltransferase. The physiological roles of these sequences were verified by the isolation of Tn10 insertion mutations in the cobS and cobT genes. All genes were named after their Salmonella typhimurium homologs and are located at the corresponding positions on the E. coli genetic map. Although the nucleotide sequences of the Salmonella cob genes and the E. coli cob genes are homologous, they are too divergent to have been derived from an operon present in their most recent common ancestor. On the basis of comparisons of G+C content, codon usage bias, dinucleotide frequencies, and patterns of synonymous and nonsynonymous substitutions, we conclude that the cob operon was introduced into the Salmonella genome from an exogenous source. The cob operon of E. coli may be related to cobalamin synthetic genes now found among non-Salmonella enteric bacteria.
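Of the compositional measures named in the abstract above, G+C content is the simplest to illustrate. The following is a minimal sketch of how such a comparison could be computed, not the authors' procedure; the function name `gc_content` and both example sequences are made up for illustration.

```python
def gc_content(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    bases = [b for b in seq.upper() if b in "ACGT"]  # skip N and other symbols
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc_count = sum(1 for b in bases if b in "GC")
    return gc_count / len(bases)


# Hypothetical toy sequences, not real cob operon or genome data.
candidate_region = "ATGGCGCGCCTGGCAGGC"
genome_background = "ATGAAATTTGCATTAGCA"

difference = gc_content(candidate_region) - gc_content(genome_background)
print(f"G+C difference: {difference:+.2f}")
```

A region whose G+C content departs markedly from the genome-wide value is one line of evidence for foreign origin, which is why the abstract combines it with codon usage bias, dinucleotide frequencies, and substitution patterns rather than relying on it alone.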

13.
The enlargement of the genome size and the decrease in genome compactness with increase in the number and size of introns is a general pattern during the evolution of eukaryotes. Among the possible mechanisms for modifying intron size, it has been suggested that the insertion of transposable elements might have an important role in driving intron evolution. The analysis of large portions of the human genome demonstrated that a relatively recent (50 to 100 MYA) accumulation of transposable elements appears to be biased, favoring a preferential insertion of LINE1 transposons into sex chromosomes rather than into autosomes. In the present work, the effect of chromosomal location on the increase in size of introns was evaluated with a comparative analysis performed on pairs of human paralogous genes, one located on the X chromosome and the second on an autosome. A phylogenetic analysis was also performed on the X-encoded proteins and their paralogs to confirm orthology-paralogy and to approximately estimate the time of gene duplication. Statistical analysis of total intron length for each pair of paralogous genes provided no evidence for a larger size of introns in the gene copies located on the X chromosome. On the contrary, introns of autosomal genes were found to be significantly longer than introns of their X-linked paralogs. Likewise, LINE1 elements were not significantly more frequent in X-chromosome introns, whereas the frequency of SINE elements showed a marginally significant bias toward autosomal introns.

14.
Based on fish genomic studies, we review mechanisms of divergence in duplicated genes (paralogs), which result in small (“subfunctionalization”) or large (“neofunctionalization”) changes in paralogs. Gene divergence occurs due to several processes, such as non-synonymous substitutions, exon-intron structure rearrangement, and alterations in regulatory regions, which cause differential temporal or spatial expression of paralogous gene copies during ontogenesis.

15.
The pairs of nitrogen fixation genes nifDK and nifEN encode the α and β subunits of nitrogenase and the two subunits of the NifNE protein complex, involved in the biosynthesis of the FeMo cofactor, respectively. Comparative analysis of the amino acid sequences of the four proteins NifD, NifK, NifE, and NifN in several archaeal and bacterial diazotrophs showed extensive sequence similarity between them, suggesting that their encoding genes constitute a novel paralogous gene family. We propose a two-step model to reconstruct the possible evolutionary history of the four genes. Accordingly, an ancestor gene gave rise, by an in-tandem paralogous duplication event followed by divergence, to an ancestral bicistronic operon; the latter, in turn, underwent a paralogous operon duplication event followed by evolutionary divergence leading to the ancestors of the present-day nifDK and nifEN operons. Both these paralogous duplication events very likely predated the appearance of the last universal common ancestor. The possible role of the ancestral gene and operon in nitrogen fixation is also discussed. Received: 21 June 1999 / Accepted: 1 March 2000

16.
The bacterial YbaK protein is a Cys-tRNA(Pro) and Cys-tRNA(Cys) deacylase   Total citations: 1 (self-citations: 0; citations by others: 1)
Bacterial prolyl-tRNA synthetases and some smaller paralogs, YbaK and ProX, can hydrolyze misacylated Cys-tRNA(Pro) or Ala-tRNA(Pro). To assess the significance of this quality control editing reaction in vivo, we tested Escherichia coli ybaK for its ability to suppress the E. coli thymidylate synthase thyA:146CCA missense mutant strain, which requires Cys-tRNA(Pro) for growth in the absence of thymine. Missense suppression was observed in a ybaK deletion background, suggesting that YbaK functions as a Cys-tRNA(Pro) deacylase in vivo. In vitro studies with the full set of 20 E. coli aminoacyl-tRNAs revealed that the Haemophilus influenzae and E. coli YbaK proteins are moderately general aminoacyl-tRNA deacylases that preferentially hydrolyze Cys-tRNA(Pro) and Cys-tRNA(Cys) and are also weak deacylases that cleave Gly-tRNA, Ala-tRNA, Ser-tRNA, Pro-tRNA, and Met-tRNA. The ProX protein acted as an aminoacyl-tRNA deacylase that cleaves preferentially Ala-tRNA and Gly-tRNA. The potential of H. influenzae YbaK to hydrolyze in vivo correctly charged Cys-tRNA(Cys) was tested in E. coli strain X2913 (ybaK+). Overexpression of H. influenzae ybaK decreased the in vivo ratio of Cys-tRNA(Cys) to tRNA(Cys) from 65 to 35% and reduced the growth rate of strain X2913 by 30% in LB medium. These data suggest that YbaK-mediated hydrolysis of aminoacyl-tRNA can influence cell growth.

17.
The ancient duplication of the Saccharomyces cerevisiae genome and subsequent massive loss of duplicated genes is apparent when it is compared to the genomes of related species that diverged before the duplication event. To learn more about the evolutionary effects of the duplication event, we compared the S. cerevisiae genome to other Saccharomyces genomes. We demonstrate that the whole genome duplication occurred before S. castellii diverged from S. cerevisiae. In addition to more accurately dating the duplication event, this finding allowed us to study the effects of the duplication on two separate lineages. Analyses of the duplication regions of the genomes indicate that most of the duplicated genes (approximately 85%) were lost before the speciation. Only a small amount of paralogous gene loss (4-6%) occurred after speciation. On the other hand, S. castellii appears to have lost several hundred genes that were not retained as duplicated paralogs. These losses could be related to genomic rearrangements that reduced the number of chromosomes from 16 to 9. In addition to S. castellii, other Saccharomyces sensu lato species likely diverged from S. cerevisiae after the duplication. A thorough analysis of these species will likely reveal other important outcomes of the whole genome duplication.

18.
Evans BJ. Genetics, 2007, 176(2): 1119-1130
Allopolyploid species form through the fusion of two differentiated genomes and, in the earliest stages of their evolution, essentially all genes in the nucleus are duplicated. Because unique mutations occur in each ancestor prior to allopolyploidization, duplicate genes in these species potentially are not interchangeable, and this could influence their genetic fates. This study explores evolution and expression of a simple duplicated complex: a heterodimer between RAG1 and RAG2 proteins in clawed frogs (Xenopus). Results demonstrate that copies of RAG1 degenerated in different polyploid species in a phylogenetically biased fashion, predominantly in only one lineage of closely related paralogs. Surprisingly, as a result of an early deletion of one RAG2 paralog, it appears that in many species RAG1/RAG2 heterodimers are composed of proteins that were encoded by unlinked paralogs. If the tetraploid ancestor of extant species of Xenopus arose through allopolyploidization and if recombination between paralogs was rare, then the genes that encode functional RAG1 and RAG2 proteins in many polyploid species were each ultimately inherited from different diploid progenitors. These observations are consistent with the notion that ancestry can influence the fate of duplicate genes millions of years after duplication, and they uncover a dimension of natural selection in allopolyploid genomes that is distinct from other genetic phenomena associated with polyploidization or segmental duplication.

19.
20.
The organization of the fatty acid synthetic genes of Haemophilus influenzae Rd is remarkably similar to that of the paradigm organism, Escherichia coli K-12, except that no homologue of the E. coli fabF gene is present. This finding is unexpected, since fabF is very widely distributed among bacteria and is thought to be the generic 3-ketoacyl-acyl carrier protein (ACP) synthase active on long-chain-length substrates. However, H. influenzae Rd contains a homologue of the E. coli fabB gene, which encodes a 3-ketoacyl-ACP synthase required for unsaturated fatty acid synthesis, and it seemed possible that the H. influenzae FabB homologue might have acquired the functions of FabF. E. coli mutants lacking fabF function are unable to regulate the compositions of membrane phospholipids in response to growth temperature. We report in vivo evidence that the enzyme encoded by the H. influenzae fabB gene has properties essentially identical to those of E. coli FabB and lacks FabF activity. Therefore, H. influenzae grows without FabF function. Moreover, as predicted from studies of the E. coli fabF mutants, H. influenzae is unable to change the fatty acid compositions of its membrane phospholipids with growth temperature. We also demonstrate that the fabB gene of Vibrio cholerae El Tor N16961 does not contain a frameshift mutation as was previously reported.
