Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
In the sequences released by the Arabidopsis Genome Initiative (AGI), we discovered a new and unexpectedly large family of orphan genes (127 genes by 01.08.99), named AtPCMP. The distribution of the AtPCMP genes on the five chromosomes suggests that the genome of Arabidopsis thaliana contains more than 200 genes of this family (1% of the whole genome). The deduced AtPCMP proteins are characterized by a surprising combinatorial organization of sequence motifs. The amino-terminal domain is made of a succession of three conserved motifs which generate considerable diversity. These proteins are classified into three subfamilies based on the length and nature of their carboxy-terminal domain, which consists of 1–6 motifs. All the motifs characterized show a high level of conservation in both sequence and spacing. A specific signature of this large family is defined. The presence of ESTs in databases and the detection of clones in A. thaliana cDNA libraries indicate that most of the genes of this family are expressed. The absence of similar sequences outside the plant kingdom strongly suggests that this unusually large orphan family is unique to plants. The features, genesis, potential function and evolution of this combinatorial and modular plant protein family are discussed.

2.
Gene family size variation is an important mechanism that shapes the natural variation for adaptation in various species. Despite its importance, the pattern of gene family size variation in green plants is still not well understood. In particular, the evolutionary pattern of genes and gene families remains unknown in the model plant Arabidopsis thaliana in the context of green plants. In this study, eight representative genomes of green plants are sampled to study gene family evolution and characterize the origination of A. thaliana genes, respectively. Four important insights gained are that: (i) the rate of gene gains and losses is about 0.001359 per gene every million years, similar to the rate in yeast, Drosophila, and mammals; (ii) some gene families evolved rapidly with extreme expansions or contractions, and 2745 gene families present in all the eight species represent the ‘core’ proteome of green plants; (iii) 70% of A. thaliana genes could be traced back to 450 million years ago; and (iv) intriguingly, A. thaliana genes with early origination are under stronger purifying selection and more conserved. In summary, the present study provides genome‐wide insights into evolutionary history and mechanisms of genes and gene families in green plants and especially in A. thaliana.
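As a rough illustration of what the reported rate implies, the sketch below multiplies the per-gene rate quoted in the abstract (about 0.001359 gains or losses per gene per million years) by a time span. The function name and the choice of 450 million years as the span are assumptions made for this example only; the abstract does not describe this calculation.

```python
# Rate quoted in the abstract: gene gains and losses per gene per million years.
RATE_PER_GENE_PER_MY = 0.001359


def expected_events_per_gene(rate_per_my, span_my):
    """Expected number of gain/loss events for one gene over span_my million years.

    Hypothetical helper: assumes the rate is constant over the whole span.
    """
    return rate_per_my * span_my


if __name__ == "__main__":
    # 450 million years is used here only because the abstract mentions that age.
    print(round(expected_events_per_gene(RATE_PER_GENE_PER_MY, 450.0), 3))
```

Under the constant-rate assumption, the product is well below one event per gene over that span, which fits the abstract's description of most gene content as old and conserved.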

3.
McCaig BC, Meagher RB, Dean JF. Planta, 2005, 221(5): 619-636
Completed genome sequences have made it clear that multicopper oxidases related to laccase are widely distributed as multigene families in higher plants. Laccase-like multicopper oxidase (LMCO) sequences culled from GenBank and the Arabidopsis thaliana genome, as well as those from several newly cloned genes, were used to construct a gene phylogeny that clearly divided plant LMCOs into six distinct classes, at least three of which predate the evolutionary divergence of angiosperms and gymnosperms. Alignments of the predicted amino acid sequences highlighted regions of variable sequence flanked by the highly conserved copper-binding domains that characterize members of this enzyme family. All of the predicted proteins contained apparent signal sequences. The expression of 13 of the 17 LMCO genes in A. thaliana was assessed in different tissues at various stages of development using RT-PCR. A diversity of expression patterns was demonstrated with some genes being expressed in a constitutive fashion, while others were only expressed in specific tissues at a particular stage of development. Only a few of the LMCO genes were expressed in a pattern that could be considered consistent with a major role for these enzymes in lignin deposition. These results are discussed in the context of other potential physiological functions for plant LMCOs, such as iron metabolism and wound healing.

4.
Owing to duplication events in its progenitor, more than 90% of the genes in the Arabidopsis thaliana genome are members of multigene families. A set of 2108 gene families, each consisting of precisely two unlinked paralogous genes, was identified in the nuclear genome of A. thaliana on the basis of sequence similarity. A systematic method for the creation of double knock‐out lines for such gene pairs, designated as DUPLO lines, was established and 200 lines are now publicly available. Their initial phenotypic characterisation led to the identification of seven lines with defects that emerge only in the adult stage. A further six lines display seedling lethality and 23 lines were lethal before germination. Another 14 lines are known to show phenotypes under non‐standard conditions or at the molecular level. Knock‐out of gene pairs with very similar coding sequences or expression profiles is more likely to produce a mutant phenotype than inactivation of gene pairs with dissimilar profiles or sequences. High coding sequence similarity and highly similar expression profiles are only weakly correlated, implying that promoter and coding regions of these gene pairs display different degrees of diversification.

5.
Lateral gene transfer (LGT) is an important mechanism of natural variation among prokaryotes. Over the full course of evolution, most or all of the genes resident in a given prokaryotic genome have been affected by LGT, yet the frequency of LGT can vary greatly across genes and across prokaryotic groups. The proteobacteria are among the most diverse of prokaryotic taxa. The prevalence of LGT in their genome evolution calls for the application of network-based methods instead of tree-based methods to investigate the relationships among these species. Here, we report networks that capture both vertical and horizontal components of evolutionary history among 1,207,272 proteins distributed across 329 sequenced proteobacterial genomes. The network of shared proteins reveals modularity structure that does not correspond to current classification schemes. On the basis of shared protein-coding genes, the five classes of proteobacteria fall into two main modules, one including the alpha-, delta-, and epsilonproteobacteria and the other including beta- and gammaproteobacteria. The first module is stable over different protein identity thresholds. The second shows more plasticity with regard to the sequence conservation of proteins sampled, with the gammaproteobacteria showing the most chameleon-like evolutionary characteristics within the present sample. Using a minimal lateral network approach, we compared LGT rates at different phylogenetic depths. In general, gene evolution by LGT within proteobacteria is very common. At least one LGT event was inferred to have occurred in at least 75% of the protein families. The average LGT rate at the species and class depth is about one LGT event per protein family, the rate doubling at the phylum level to an average of two LGT events per protein family. Hence, our results indicate that the rate of gene acquisition per protein family is similar at the level of species (by recombination) and at the level of classes (by LGT). The frequency of LGT per genome strongly depends on the species lifestyle, with endosymbionts showing far lower LGT frequencies than free-living species. Moreover, the nature of the transferred genes suggests that gene transfer in proteobacteria is frequently mediated by conjugation.

6.
7.
The plant cell wall is of supermolecular architecture, and is composed of various types of heterogeneous polymers. A few thousand enzymes and structural proteins are directly involved in the construction processes, and in the functional aspects of the dynamic architecture in Arabidopsis thaliana. Most of these proteins are encoded by multigene families, and most members within each family share significant similarities in structural features, but often exhibit differing expression profiles and physiological functions. Thus, for the molecular dissection of cell wall dynamics, it is necessary to distinguish individual members within a family of proteins. As a first step towards characterizing the processes involved in cell wall dynamics, we have manufactured a gene-specific 70-mer oligo microarray that consists of 765 genes classified into 30 putative families of proteins that are implicated in the cell wall dynamics of Arabidopsis. By using this array system, we identified several sets of genes that exhibit organ preferential expression profiles. We also identified gene sets that are expressed differentially at certain specific growth stages of the Arabidopsis inflorescence stem. Our results indicate that there is a division of roles among family members within each of the putative cell wall-related gene families.

8.
Prolamin and resistance gene families are important in wheat food use and in defense against pathogen attacks, respectively. To better understand the evolution of these multi‐gene families, the DNA sequence of a 2.8‐Mb genomic region, representing an 8.8 cM genetic interval and harboring multiple prolamin and resistance‐like gene families, was analyzed in the diploid grass Aegilops tauschii, the D‐genome donor of bread wheat. Comparison with orthologous regions from rice, Brachypodium, and sorghum showed that the Ae. tauschii region has undergone dramatic changes; it has acquired more than 80 non‐syntenic genes and only 13 ancestral genes are shared among these grass species. These non‐syntenic genes, including prolamin and resistance‐like genes, originated from various genomic regions and likely moved to their present locations via sequence evolution processes involving gene duplication and translocation. Local duplication of non‐syntenic genes contributed significantly to the expansion of gene families. Our analysis indicates that the insertion of prolamin‐related genes occurred prior to the separation of the Brachypodieae and Triticeae lineages. Unlike in Brachypodium, inserted prolamin genes have rapidly evolved and expanded to encode different classes of major seed storage proteins in Triticeae species. Phylogenetic analyses also showed multiple insertions of resistance‐like genes and subsequent differential expansion of each R gene family. The high frequency of non‐syntenic genes and rapid local gene evolution correlate with the high recombination rate in the 2.8‐Mb region, which is nine‐fold higher than the genome‐wide average. Our results demonstrate complex evolutionary dynamics in this agronomically important region of Triticeae species.

9.
Plastid genomes of the grasses (Poaceae) are unusual in their organization and rates of sequence evolution. There has been a recent surge in the availability of grass plastid genome sequences, but a comprehensive comparative analysis of genome evolution has not been performed that includes any related families in the Poales. We report on the plastid genome of Typha latifolia, the first non-grass Poales sequenced to date, and we present comparisons of genome organization and sequence evolution within Poales. Our results confirm that grass plastid genomes exhibit acceleration in both genomic rearrangements and nucleotide substitutions. Poaceae have multiple structural rearrangements, including three inversions, three gene losses (accD, ycf1, ycf2), intron losses in two genes (clpP, rpoC1), and expansion of the inverted repeat (IR) into both large and small single-copy regions. These rearrangements are restricted to the Poaceae, and IR expansion into the small single-copy region correlates with the phylogeny of the family. Comparisons of 73 protein-coding genes for 47 angiosperms including nine Poaceae genera confirm that the branch leading to Poaceae has significantly accelerated rates of change relative to other monocots and angiosperms. Furthermore, rates of sequence evolution within grasses are lower, indicating a deceleration during diversification of the family. Overall there is a strong correlation between accelerated rates of genomic rearrangements and nucleotide substitutions in Poaceae, a phenomenon that has been noted recently throughout angiosperms. The cause of the correlation is unknown, but faulty DNA repair has been suggested in other systems including bacterial and animal mitochondrial genomes.

10.
The genome of Tetrahymena pyriformis has been shown to contain a ubiquitin multigene family consisting of several polyubiquitin genes and at least one ubiquitin fusion gene. We report here the isolation and characterization of one genomic clone (pTUl1) that encodes a ubiquitin extension protein. A comparison of the predicted amino acid sequence of the ubiquitin extension protein gene of T. pyriformis with those from other organisms indicated a high degree of homology. However, the Tetrahymena ubiquitin extension protein contains 53 and not 52 amino acids. This feature is different from all ubiquitin 52-amino-acid extension protein genes thus far sequenced. Furthermore, we found an array of four cysteine residues similar to those found in nucleic acid binding proteins. Also, the C-terminal sequence possesses a conserved motif which may represent a nuclear translocation signal. The ubiquitin 53-amino-acid extension protein gene encodes the smallest class of ubiquitin mRNAs in T. pyriformis.

11.
Lateral gene transfer (LGT) in prokaryotes has been shown to rapidly change genome content, providing new gene tools for environmental adaptation. Features related to pathogenesis and resistance to strong selective conditions have been widely shown to be products of gene transfer between bacteria. The genomes of the γ-proteobacteria from the genus Xanthomonas, composed mainly of phytopathogens, have potential genomic islands that may represent imprints of such evolutionary processes. In this work, the evolution of genes involved in the pathway responsible for arginine biosynthesis in Xanthomonadales was investigated, and several lines of evidence point to the foreign origin of the arg genes clustered within a potential operon. Their presence inside a potential genomic island, bordered by a tRNA gene, the unusual ranking of sequence similarity, and the atypical phylogenies indicate that the metabolic pathway for arginine biosynthesis was acquired through LGT in the Xanthomonadales group. Moreover, although homologues were also found in Bacteroidetes (Flavobacteria group), for many of the genes analyzed close homologues are detected in different life domains (Eukarya and Archaea), indicating that the source of these arg genes may have been outside the Bacteria clade. The possibility of replacement of a complete primary metabolic pathway by LGT events supports the selfish operon hypothesis and may occur only under very special environmental conditions. Such rare events reveal part of the history of these interesting mosaic Xanthomonadales genomes, disclosing the importance of gene transfer modifying primary metabolism pathways and extending the scenario for bacterial genome evolution.

12.
Sucrose synthase is a key enzyme in sucrose metabolism in plant cells, and it is involved in the synthesis of cell wall cellulose. Although the sucrose synthase gene (SUS) family in the model plant Arabidopsis thaliana has been characterized, little is known about this gene family in trees. This study reports the identification of two novel SUS genes in the economically important poplar tree. These genes were expressed predominantly in mature xylem. Using molecular cloning and bioinformatics analysis of the Populus genome, we demonstrated that SUS is a multigene family with seven members that each exhibit distinct but partially overlapping expression patterns. Of particular interest, three SUS genes were preferentially expressed in the stem xylem, suggesting that poplar SUSs are involved in the formation of the secondary cell wall. Gene structural and phylogenetic analyses revealed that the Populus SUS family is composed of four main subgroups that arose before the separation of monocots and dicots. The phylogenetic groupings were associated with the tissue- and organ-specific expression patterns. High intraspecific nucleotide diversity was detected for two SUS genes in the natural population, and the π nonsyn/π syn ratio was significantly less than 1; therefore, SUS genes appear to be evolving in Populus primarily under purifying selection. This is the first comprehensive study of the SUS gene family in woody plants; the analysis includes genome organization, gene structure, and phylogeny across land plant lineages, as well as expression profiling in Populus.

13.
Chromosomal organization and the evolution of genome architecture can be investigated by physical mapping of the genes for 45S and 5S ribosomal DNAs (rDNAs) and by the analysis of telomeric sequences. We studied 12 species of bats belonging to four subfamilies of the family Phyllostomidae in order to correlate patterns of distribution of heterochromatin and the multigene families for rDNA. The number of clusters for the 45S gene ranged from one to three pairs, located exclusively on autosomes, except in Carollia perspicillata, where the cluster was located on the X chromosome. For the 5S gene, all the species studied had only one site, located on an autosomal pair. In no species were the 45S and 5S genes co-located. The fluorescence in situ hybridization (FISH) probe for telomeric sequences revealed fluorescence on all telomeres in all species, except in Carollia perspicillata. Non-telomeric sites in the pericentromeric region of the chromosomes were observed in most species, ranging from one to 12 pairs. Most interstitial telomeric sequences were coincident with heterochromatic regions. The results obtained in the present work indicate that different evolutionary mechanisms act on Phyllostomidae genome architecture and that Robertsonian fusion occurred during the chromosomal evolution of bats without a loss of telomeric sequences. These data contribute to understanding the organization of multigene families and telomeric sequences in the bat genome as well as the chromosomal evolutionary history of Phyllostomidae bats.

14.
15.
Elucidation of genome sequences provides an excellent platform to understand the detailed complexity of the various gene families. Hsp100 is an important family of chaperones in diverse living systems. There are eight putative gene loci encoding Hsp100 proteins in the Arabidopsis genome. In rice, two full-length Hsp100 cDNAs have been isolated and sequenced so far. Analysis of the rice genomic sequence by an in silico approach showed that the two isolated rice Hsp100 cDNAs correspond to the Os05g44340 and Os02g32520 genes in the rice genome database. There appear to be three additional proteins (encoded by the Os03g31300, Os04g32560 and Os04g33210 gene loci) that are variably homologous to Os05g44340 and Os02g32520 throughout the entire amino acid sequence. The above five rice Hsp100 genes show significant similarities in the signature sequences known to be conserved among Hsp100 proteins. While Os05g44340 encodes a cytoplasmic Hsp100 protein, those encoded by the other four genes are predicted to have chloroplast transit peptides.

16.
During its life cycle, the protist parasite Entamoeba histolytica encounters reactive oxygen and nitrogen species that alter its genome. Base excision repair (BER) is one of the most important pathways for the repair of DNA base lesions. Analysis of the E. histolytica genome revealed the presence of most of the BER components. Surprisingly, this included a gene encoding an apurinic/apyrimidinic (AP) endonuclease that previous studies had assumed was absent. Indeed, our analysis showed that the genome of E. histolytica harbors the genes needed for both short and long-patch BER sub-pathways. These genes include DNA polymerases with predicted 5′-dRP lyase and strand-displacement activities and a sole DNA ligase. A distinct feature of the E. histolytica genome is the lack of several key damage-specific BER glycosylases, such as OGG1/MutM, MBD4, Mag1, MPG, SMUG, and TDG. Our evolutionary analysis indicates that several E. histolytica DNA glycosylases were acquired by lateral gene transfer (LGT). The genes that encode MutY, AlkD, and UDG (Family VI) are included among these cases. Endonuclease III and UNG (family I) are the only DNA glycosylases with a eukaryotic origin in E. histolytica. A gene encoding a MutT 8-oxodGTPase that was acquired by LGT was also identified. The mixed composition of BER genes as a DNA metabolic pathway shaped by LGT in E. histolytica indicates that LGT plays a major role in the evolution of this eukaryote. Sequence and structural prediction of E. histolytica DNA glycosylases, as well as MutT, suggest that the E. histolytica DNA repair proteins evolved to harbor structural modifications that may confer unique biochemical features needed for the biology of this parasite.

17.
18.
Biological and computer-assisted analyses of a 25 kb fragment from Arabidopsis thaliana chromosome IV led to the characterization of two multigene families and three novel orphan genes, not previously described. The first gene family named AtMO1-4 encodes monooxygenases, related to the prokaryotic salicylate hydroxylases. The second gene family contains three members, two on the analysed 25 kb fragment and one on chromosome I. The latter three genes lack introns and are homologous to the previously studied Glycine max src2 gene which is overexpressed at low temperature. Gene expression and primary structure of the deduced proteins are described and compared. Three genes of unknown function, showing tissue specific expressions, are characterized on the 25 kb fragment. Full length or partial cognate cDNAs have been sequenced for all the genes studied.

19.
Ubiquitin (Ub)-conjugating enzymes (E2) are key enzymes in ubiquitination or Ub-like modifications of proteins. We searched for all proteins belonging to the E2 enzyme super-family in seven species (Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, Schizosaccharomyces pombe, Saccharomyces cerevisiae, and Arabidopsis thaliana) to identify families and to reconstruct each family’s phylogeny. Our phylogenetic analysis of 207 genes led us to define 17 E2 families, with 37 E2 genes, in the human genome. The subdivision of E2 into four classes did not correspond to the phylogenetic tree. The sequence signature HPN (histidine–proline–asparagine), followed by a tryptophan residue at 16 (up to 29) amino acids, was highly conserved. When present, the active cysteine was found 7 to 8 amino acids from the C-terminal end of HPN. The secondary structures were characterized by a canonical alpha/beta fold. Only family 10 deviated from the common organization because the proteins were devoid of enzymatic activity. Family 7 had an insertion between beta strands 1 and 2; families 3, 5 and 14 had an insertion between the active cysteine and the conserved tryptophan. The three-dimensional data of these proteins highlight a strong structural conservation of the core domain. Our analysis shows that the primitive eukaryote ancestor possessed a diversified set of E2 enzymes, thus emphasizing the importance of the Ub pathway. This comprehensive overview of E2 enzymes emphasizes the diversity and evolution of this superfamily and helps clarify the nomenclature and true orthologies. A better understanding of the functions of these enzymes is necessary to decipher several human diseases.
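The signature described in this abstract can be sketched as a regular-expression scan over a protein sequence. This is only one possible reading, not the authors' method: it assumes that both distances are counted as residue offsets from the N of HPN (cysteine at offset 7 or 8, tryptophan at offset 16 to 29). The function name and the toy sequences are hypothetical and are not taken from the paper.

```python
import re

# Assumed reading of the abstract: offsets are counted from the N of HPN.
# Tryptophan at offset 16..29 means 15..28 arbitrary residues, then W.
HPN_TRP = re.compile(r"HPN(?=.{15,28}W)")
# Cysteine at offset 7..8 means 6..7 arbitrary residues, then C.
HPN_CYS_TRP = re.compile(r"HPN(?=.{6,7}C)(?=.{15,28}W)")


def find_e2_signature(seq, require_cysteine=True):
    """Return 0-based start positions of HPN motifs matching the assumed signature.

    require_cysteine=False drops the cysteine condition, since the abstract
    notes that the active cysteine is not always present.
    """
    pattern = HPN_CYS_TRP if require_cysteine else HPN_TRP
    return [m.start() for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    toy = "AAHPNINSNGSICLDILRSQWSPA"  # toy fragment made up for this example
    print(find_e2_signature(toy))
```

Lookaheads are used so that the cysteine and tryptophan conditions are checked independently from the same anchor, which keeps the two distance ranges easy to adjust if a different counting convention is preferred.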

20.
Similar to the Igh-V multigene family, the human or mouse Igk-V repertoire is a distorted continuum of homologous genes that may be grouped into families displaying >80% nucleic acid sequence similarity among their members. Systematic interspecies sequence comparisons reveal that most human Igk-V gene families exhibit clear homology to mouse Igk-V families (sequence similarity >74%). A hypothetical phylogenetic tree of Igk-V genes predicts that a minimum of seven Igk-V genes/families predate mammalian radiation. In two cases, several interrelated mouse Igk-V families exhibit phylogenetic equidistance with just one human Igk-V family, implying a more pronounced divergence for the elevated number of Igk-V gene families in the mouse. Mouse-human Igk-V comparisons, moreover, illustrate how expansion, contraction, and perhaps deletion of Igk-V gene families shape the Igk-V repertoire during mammalian evolution.
