Similar articles
20 similar articles found.
1.
We compared the exon/intron organization of vertebrate genes belonging to different isochore classes, as predicted by their GC content at third codon position. Two main features have emerged from the analysis of sequences published in GenBank: (1) genes coding for long proteins (i.e., 500 aa) are almost two times more frequent in GC-poor than in GC-rich isochores; (2) intervening sequences (i.e., the sum of introns) are on average three times longer in GC-poor than in GC-rich isochores. These patterns are observed among human, mouse, rat, cow, and even chicken genes and are therefore likely to be common to all warm-blooded vertebrates. Analysis of Xenopus sequences suggests that the same patterns exist in cold-blooded vertebrates. It could be argued that such results do not reflect reality because sequence databases are not representative of entire genomes. However, analysis of biases in GenBank revealed that the observed discrepancies between GC-rich and GC-poor isochores are not artifactual, and are probably largely underestimated. We investigated the distribution of microsatellites and interspersed repeats in introns of human and mouse genes from different isochores. This analysis confirmed previous studies showing that L1 repeats are almost absent from GC-rich isochores. Microsatellites and SINEs (Alu, B1, B2) are found at roughly equal frequencies in introns from all isochore classes. Globally, the presence of repeated sequences does not account for the increased intron length in GC-poor isochores. The relationships between gene structure and global genome organization and evolution are discussed.
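
The GC content at third codon positions (often written GC3), which the abstract above uses to predict the isochore class of a gene, can be illustrated with a minimal Python sketch. The function name `gc3` and the handling of incomplete trailing codons are assumptions made for this illustration; the abstract does not describe the authors' actual procedure or any class thresholds.

```python
def gc3(cds: str) -> float:
    """Fraction of G or C at third codon positions of an in-frame coding sequence.

    Assumption for this sketch: the sequence starts in frame, and any
    incomplete codon at the 3' end is ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3   # drop a trailing partial codon
    thirds = seq[2:usable:3]           # every third base: positions 3, 6, 9, ...
    if not thirds:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in thirds) / len(thirds)


if __name__ == "__main__":
    # Third positions of ATG AAA TTT are G, A, T, so one of three is G or C.
    print(gc3("ATGAAATTT"))
```

A gene would then be assigned to a GC-poor or GC-rich class by comparing this value against cut-offs chosen by the analyst; no cut-offs are given in the abstract, so none are assumed here.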

2.
The compositional distribution of coding sequences from five vertebrates (Xenopus, chicken, mouse, rat, and human) is shifted toward higher GC values compared to that of the DNA molecules (in the 35–85-kb size range) isolated from the corresponding genomes. This shift is due to the lower GC levels of intergenic sequences compared to coding sequences. In the cold-blooded vertebrate, the two distributions are similar in that GC-poor genes and GC-poor DNA molecules are largely predominant. In contrast, in the warm-blooded vertebrates, GC-rich genes are largely predominant over GC-poor genes, whereas GC-poor DNA molecules are largely predominant over GC-rich DNA molecules. As a consequence, the genomes of warm-blooded vertebrates show a compositional gradient of gene concentration. The compositional distributions of coding sequences (as well as of DNA molecules) showed remarkable differences between chicken and mammals, and between mouse (or rat) and human. Differences were also detected in the compositional distribution of housekeeping and tissue-specific genes, the former being more abundant among GC-rich genes.

3.
S Zoubak, A Rynditch, G Bernardi. Gene 1992, 119(2):207-213
The compositional distributions of genomes, genes (and their third codon positions) and long terminal repeats from retroviruses of warm-blooded vertebrates are characterized by a striking bimodality which is accompanied by a remarkable compositional homogeneity within each retroviral genome. A first, major class of retroviral genomes is GC-rich, whereas a second, minor class is GC-poor. Representative expressed viral genomes from the two classes integrate in GC-rich and GC-poor isochores, respectively, of host genomes. The first class comprises all oncoviruses (except B-types and some D-types), the second, lentiviruses, spumaviruses, as well as B-type and some D-type oncoviruses (e.g., mouse mammary tumor virus and simian retroviruses type D, respectively). The compositional bimodal distribution of retroviral genomes and the accompanying compositional homogeneity within each retroviral genome appear to be the result of the compositional evolution of retroviral genomes in their integrated form.

4.
We have found previously that the sequences important for recognition of pre-mRNA introns in dicot plants differ from those in the introns of vertebrates and yeast. Neither a conserved branch point nor a polypyrimidine tract, found in yeast and vertebrate introns respectively, is required. Instead, AU-rich sequences, a characteristic feature of dicot plant introns, are essential. Here we show that splicing in protoplasts of maize, a monocot, differs significantly from splicing in a dicot, Nicotiana plumbaginifolia. As in the case of dicots, a conserved branch point and a polypyrimidine tract are not required for intron processing in maize. However, unlike in dicots, AU-rich sequences are not essential, although their presence facilitates splicing if the splice site sequences are not optimal. The lack of an absolute requirement for AU-rich stretches in monocot introns is reflected in the occurrence of GC-rich introns in monocots but not in dicots. We also show that maize protoplasts are able to process a mammalian intron and short introns containing stem-loops, neither of which is spliced in N. plumbaginifolia protoplasts. The ability of maize, but not of N. plumbaginifolia, to process stem-loop-containing or GC-rich introns suggests that one of the functions of AU-rich sequences during splicing of dicot plant pre-mRNAs may be to minimize secondary structure within the intron.

5.
Carels N. FEBS letters 2005, 579(18):3867-3871
Previous investigations by Southern hybridization of cDNA with compositional DNA fractions showed that the majority of maize genes are located in a narrow GC range of DNA fragments and that the corresponding gene space was GC-richer than the region of the genome where zein genes are found. Here, we revisited the maize gene space using new data from the maize genome sequencing initiative. We found that the maize gene space itself is formed of two compositional compartments, a GC-poor one and a GC-rich one, characterized by different distributions of Opie and Huck retrotransposons. The GC-rich compartment tends to be richer in GC-rich genes than the GC-poor compartment. However, the gene space compartmentalization of maize is much simpler than that of human.

6.
Mukhopadhyay P, Basak S, Ghosh TC. Gene 2007, 400(1-2):71-81
Synonymous codon usage and cellular tRNA abundance are thought to have co-evolved to optimize translational efficiency in highly expressed genes. Here, taking advantage of publicly available gene expression data for rice and Arabidopsis, we demonstrated that tRNA gene copy number is not the only driving force favoring translational selection in all highly expressed genes of rice. We found that forces favoring translational selection differ between GC-rich and GC-poor classes of genes. Supporting our results, we also showed that in highly expressed genes of the GC-poor class there is a perfect correspondence between the majority of preferred codons and tRNA gene copy number, which confers translational efficiency on this group of genes. However, tRNA gene copy number is not fully consistent with models of translational selection in the GC-rich group of genes, where constraints on mRNA secondary structure play a role in optimizing codon usage in highly expressed genes.

7.
Asakura Y, Barkan A. Plant physiology 2006, 142(4):1656-1663
Chloroplast genomes in plants and green algae contain numerous group II introns, large ribozymes that splice via the same chemical steps as spliceosome-mediated splicing in the nucleus. Most chloroplast group II introns are degenerate, requiring interaction with nucleus-encoded proteins to splice in vivo. Genetic approaches in maize (Zea mays) and Chlamydomonas reinhardtii have elucidated distinct sets of proteins that assemble with chloroplast group II introns and facilitate splicing. Little information is available, however, concerning these processes in Arabidopsis (Arabidopsis thaliana). To determine whether the paucity of data concerning chloroplast splicing factors in Arabidopsis reflects a fundamental difference between protein-facilitated group II splicing in monocot and dicot plants, we examined the mutant phenotypes associated with T-DNA insertions in Arabidopsis genes encoding orthologs of the maize chloroplast splicing factors CRS1, CAF1, and CAF2 (AtCRS1, AtCAF1, and AtCAF2). We show that the splicing functions and intron specificities of these proteins are largely conserved between maize and Arabidopsis, indicating that these proteins were recruited to promote the splicing of plastid group II introns prior to the divergence of monocot and dicot plants. We show further that AtCAF1 promotes the splicing of two group II introns, rpoC1 and clpP-intron 1, that are found in Arabidopsis but not in maize; AtCAF1 is the first splicing factor described for these introns. Finally, we show that a strong AtCAF2 allele conditions an embryo-lethal phenotype, adding to the body of data suggesting that cell viability is more sensitive to the loss of plastid translation in Arabidopsis than in maize.

8.
The genomes of homeothermic (warm-blooded) vertebrates are mosaic interspersions of homogeneously GC-rich and GC-poor regions (isochores). Evolution of genome compartmentalization and GC-rich isochores is hypothesized to reflect either selective advantages of an elevated GC content or chromosome location and mutational pressure associated with the timing of DNA replication in germ cells. To address the present controversy regarding the origins and maintenance of isochores in homeothermic vertebrates, newly obtained as well as published nucleotide sequences of the insulin and insulin-like growth factor (IGF) genes, members of a well-characterized gene family believed to have evolved by repeated duplication and divergence, were utilized to examine the evolution of base composition in nonconstrained (flanking) and weakly constrained (introns and fourfold degenerate sites) regions. A phylogeny derived from amino acid sequences supports a common evolutionary history for the insulin/IGF family genes. In cold-blooded vertebrates, insulin and the IGFs were similar in base composition. In contrast, insulin and IGF-II demonstrate dramatic increases in GC richness in mammals, but no such trend occurred in IGF-I. Base composition of the coding portions of the insulin and IGF genes across vertebrates correlated (r = 0.90) with that of the introns and flanking regions. The GC content of homologous introns differed dramatically between insulin/IGF-II and IGF-I genes in mammals but was similar to the GC level of noncoding regions in neighboring genes. Our findings suggest that the base composition of introns and flanking regions is determined by chromosomal location and the mutational pressure of the isochore in which the sequences are embedded. An elevated GC content at codon third positions in the insulin and the IGF genes may reflect selective constraints on the usage of synonymous codons.

9.
Plant introns are typically AU-rich or U-rich, and this feature has been shown to be important for splicing. In maize, however, about 20% of the introns exceed 50% GC, and most of them are efficiently spliced. A series of constructs has been designed to analyze the cis requirements for splicing of the GC-rich Bz2 maize intron and two other GC-rich intron derivatives. By manipulating exon, intron and splice site sequences it is shown that exons can play an important role in intron definition: changes in exon sequences can increase splicing efficiency of a GC-rich intron from 17% to 86%. The relative difference, or base compositional contrast, in GC and U content between exon and intron sequences in the vicinity of splice sites, rather than the absolute base-content of the intron or exons, correlates with splicing efficiency. It is also shown that GC-rich intron constructs that are poorly spliced can be partially rescued by an improved 3' splice site.

10.
Intron-exon structures of eukaryotic model organisms.
To investigate the distribution of intron-exon structures of eukaryotic genes, we have constructed a general exon database comprising all available intron-containing genes and exon databases from 10 eukaryotic model organisms: Homo sapiens, Mus musculus, Gallus gallus, Rattus norvegicus, Arabidopsis thaliana, Zea mays, Schizosaccharomyces pombe, Aspergillus, Caenorhabditis elegans and Drosophila. We purged redundant genes to avoid the possible bias brought about by redundancy in the databases. After discarding those questionable introns that do not contain correct splice sites, the final database contained 17 102 introns, 21 019 exons and 2903 independent or quasi-independent genes. On average, a eukaryotic gene contains 3.7 introns per kb protein coding region. The exon distribution peaks around 30-40 residues and most introns are 40-125 nt long. The variable intron-exon structures of the 10 model organisms reveal two interesting statistical phenomena, which cast light on some previous speculations. (i) Genome size seems to be correlated with total intron length per gene. For example, invertebrate introns are smaller than those of human genes, while yeast introns are shorter than invertebrate introns. However, this correlation is weak, suggesting that other factors besides genome size may also affect intron size. (ii) Introns smaller than 50 nt are significantly less frequent than longer introns, possibly resulting from a minimum intron size requirement for intron splicing.
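
The per-gene quantities mentioned in the abstract above (intron lengths, total intron length per gene, and introns per kb of coding region) can be derived from exon coordinates. The Python sketch below is only an illustration: the function name `intron_stats`, the 1-based inclusive coordinate convention, and the simplification that every exon is entirely coding are assumptions made here, not details taken from the article.

```python
def intron_stats(exons):
    """Summarize the intron-exon structure of one gene.

    `exons` is a list of (start, end) pairs, 1-based and inclusive, on the
    gene's own strand. Assumption for this sketch: all exons are fully
    coding, so total exon length stands in for coding-region length.
    """
    ordered = sorted(exons)
    intron_lengths = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start <= prev_end:
            raise ValueError("exons overlap")
        # An intron is the gap between two consecutive exons (0 if adjacent).
        gap = next_start - prev_end - 1
        if gap > 0:
            intron_lengths.append(gap)
    coding_nt = sum(end - start + 1 for start, end in ordered)
    return {
        "intron_lengths": intron_lengths,
        "total_intron_length": sum(intron_lengths),
        "introns_per_kb_coding": len(intron_lengths) / (coding_nt / 1000),
    }


if __name__ == "__main__":
    # Hypothetical gene with three exons totalling 1000 nt of coding sequence.
    print(intron_stats([(1, 300), (401, 700), (826, 1225)]))
```

For the hypothetical gene in the usage example, the two introns are 100 and 125 nt long and the density is 2.0 introns per kb of coding sequence.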

11.
12.
Sequencing of eukaryotic genomes allows one to address major evolutionary problems, such as the evolution of gene structure. We compared the intron positions in 684 orthologous gene sets from 8 complete genomes of animals, plants, fungi, and protists and constructed parsimonious scenarios of evolution of the exon-intron structure for the respective genes. Approximately one-third of the introns in the malaria parasite Plasmodium falciparum are shared with at least one crown group eukaryote; this number indicates that these introns have been conserved through >1.5 billion years of evolution that separate Plasmodium from the crown group. Paradoxically, humans share many more introns with the plant Arabidopsis thaliana than with the fly or nematode. The inferred evolutionary scenario holds that the common ancestor of Plasmodium and the crown group and, especially, the common ancestor of animals, plants, and fungi had numerous introns. Most of these ancestral introns, which are retained in the genomes of vertebrates and plants, have been lost in fungi, nematodes, arthropods, and probably Plasmodium. In addition, numerous introns have been inserted into vertebrate and plant genes, whereas, in other lineages, intron gain was much less prominent.

13.
Angiosperms (flowering plants), including both monocots and dicots, contain small catalase gene families. In the dicot, Arabidopsis thaliana, two catalase (CAT) genes, CAT1 and CAT3, are tightly linked on chromosome 1 and a third, CAT2, which is more similar to CAT1 than to CAT3, is unlinked on chromosome 4. Comparison of positions and numbers of introns among 13 angiosperm catalase genomic sequences indicates that intron positions are conserved, and suggests that an ancestral catalase gene common to monocots and dicots contained seven introns. Arabidopsis CAT2 has seven introns; both CAT1 and CAT3 have six introns in positions conserved with CAT2, but each has lost a different intron. We suggest the following sequence of events during the evolution of the Arabidopsis catalase gene family. An initial duplication of an ancestral catalase gene gave rise to CAT3 and CAT1. CAT1 then served as the template for a second duplication, yielding CAT2. Intron losses from CAT1 and CAT3 followed these duplications. One subclade of monocot catalases has lost all but the 5'-most and 3'-most introns, which is consistent with a mechanism of intron loss by replacement of an ancestral intron-containing gene with a reverse-transcribed DNA copy of a fully spliced mRNA. Following this event of concerted intron loss, the Oryza sativa (rice, a monocot) CAT1 lineage acquired an intron in a novel position, consistent with a mechanism of intron gain at proto-splice sites.

14.
The compositional distributions of large (main-band) DNA fragments from eight birds belonging to eight different orders (including both paleognathous and neognathous species) are very broad and extremely close to each other. These findings, which are paralleled by the compositional similarity of homologous coding sequences and their codon positions, support the idea that birds are a monophyletic group. The compositional distribution of third-codon positions of genes from chicken, the only avian species for which a relatively large number of coding sequences is known, is very broad and bimodal, the minor GC-richer peak reaching 100% GC. The very high compositional heterogeneity of avian genomes is accompanied (as in the case of mammalian genomes) by a very high speciation rate compared to cold-blooded vertebrates, which are characterized by genomes that are much less heterogeneous. The higher GC levels attained by avian compared to mammalian genomes might be correlated with the higher body temperature (41–43°C) of birds compared to mammals (37°C). A comparison of GC levels of coding sequences and codon positions from man and chicken revealed very close average GC levels and standard deviations. Homologous coding sequences and codon positions from man and chicken showed a surprisingly high degree of compositional similarity which was, however, higher for GC-poor than for GC-rich sequences. This indicates that GC-poor isochores of warm-blooded vertebrates reflect the composition of the isochores of the genome of the common reptilian ancestor of mammals and birds, which underwent only a small compositional change at the transition from cold- to warm-blooded vertebrates. In contrast, the GC-rich isochores of birds and mammals are the result of large compositional changes at the same evolutionary transition, which were in part different in the two classes of warm-blooded vertebrates. Correspondence to: G. Bernardi

15.
G Matassi, R Melis, K C Kuo, G Macaya, C W Gehrke, G Bernardi. Gene 1992, 122(2):239-245
Methylation was investigated in compositional fractions of nuclear DNA preparations (50-100 kb in size) from five plants (onion, maize, rye, pea and tobacco), and was found to increase from GC-poor to GC-rich fractions. This methylation gradient showed different patterns in different plants and appears, therefore, to represent a novel, characteristic genome feature which concerns the noncoding, intergenic sequences that make up the bulk of the plant genomes investigated and mainly consist of repetitive sequences. The structural and functional implications of these results are discussed.

16.
Jeong YM, Mun JH, Lee I, Woo JC, Hong CB, Kim SG. Plant physiology 2006, 140(1):196-209
Profilin is a small actin-binding protein that regulates cellular dynamics of the actin cytoskeleton. In Arabidopsis (Arabidopsis thaliana), five profilins were identified. The vegetative class profilins, PRF1, PRF2, and PRF3, are expressed in vegetative organs. The reproductive class profilins, PRF4 and PRF5, are mainly expressed in pollen. In this study, we examined the role of the first intron in the expression of the Arabidopsis profilin gene family using transgenic plants and a transient expression system. In transgenic plants, we examined PRF2 and PRF5, which represent vegetative and reproductive profilins. The expression of the PRF2 promoter fused with the beta-glucuronidase (GUS) gene was observed in the vascular bundles, but transgenic plants carrying the PRF2 promoter-GUS with its first intron showed constitutive expression throughout the vegetative tissues. However, the first intron of PRF5 had little effect on the reporter gene expression pattern. Transgenic plants containing the PRF5 promoter-GUS fusion with or without its first intron showed reproductive tissue-specific expression. To further investigate the different roles of the two first introns in gene expression, the first introns were exchanged between PRF2 and PRF5. The first intron of PRF5 had no apparent effect on the expression pattern of the PRF2 promoter. But, unlike the intron of PRF5, the first intron of PRF2 greatly affected the reproductive tissue-specific expression of the PRF5 promoter, confirming a different role for these introns. The results of a transient expression assay indicated that the first introns of PRF1 and PRF2 enhance gene expression, whereas those of PRF4 and PRF5 do not. These results suggest that the first introns of profilin genes are functionally distinctive and that the first introns are required for the strong and constitutive gene expression of PRF1 and PRF2 in vegetative tissues.

17.
Asakura Y, Barkan A. The Plant cell 2007, 19(12):3864-3875
The CRM domain is a recently recognized RNA binding domain found in three group II intron splicing factors in chloroplasts, in a bacterial protein that associates with ribosome precursors, and in a family of uncharacterized proteins in plants. To elucidate the functional repertoire of proteins with CRM domains, we studied CFM2 (for CRM Family Member 2), which harbors four CRM domains. RNA coimmunoprecipitation assays showed that CFM2 in maize (Zea mays) chloroplasts is associated with the group I intron in pre-trnL-UAA and group II introns in the ndhA and ycf3 pre-mRNAs. T-DNA insertions in the Arabidopsis thaliana ortholog condition a defective-seed phenotype (strong allele) or chlorophyll-deficient seedlings with impaired splicing of the trnL group I intron and the ndhA, ycf3-int1, and clpP-int2 group II introns (weak alleles). CFM2 and two previously described CRM proteins are bound simultaneously to the ndhA and ycf3-int1 introns and act in a nonredundant fashion to promote their splicing. With these findings, CRM domain proteins are implicated in the activities of three classes of catalytic RNA: group I introns, group II introns, and 23S rRNA.

18.
19.
20.
Some of the principal transitions in the evolution of eukaryotes are characterized by engulfment of prokaryotes by primitive eukaryotic cells. In particular, approximately 1.6 billion years ago, engulfment of a cyanobacterium that became the ancestor of chloroplasts and other plastids gave rise to Plantae, the major branch of eukaryotes comprised of glaucophytes, red algae, green algae, and green plants. After endosymbiosis, there was large-scale migration of genes from the endosymbiont to the nuclear genome of the host such that approximately 18% of the nuclear genes in Arabidopsis appear to be of chloroplast origin. To gain insights into the process of evolution of gene structure in these, originally, intronless genes, we compared the properties and the evolutionary dynamics of introns in genes of plastid origin and ancestral eukaryotic genes in Arabidopsis, poplar, and rice genomes. We found that intron densities in plastid-derived genes were slightly but significantly lower than those in ancestral eukaryotic genes. Although most of the introns in both categories of genes were conserved between monocots (rice) and dicots (Arabidopsis and poplar), lineage-specific intron gain was more pronounced in plastid-derived genes than in ancestral genes, whereas there was no significant difference in the intron loss rates between the 2 classes of genes. Thus, after the transfer to the nuclear genome, the plastid-derived genes have undergone a massive intron invasion that, by the time of the divergence of dicots and monocots (150-200 MYA), yielded intron densities only slightly lower than those in ancestral genes. Nevertheless, the accumulation of introns in plastid-derived genes appears not to have reached saturation and continues to this time, albeit at a low rate. The overall pattern of intron gain and loss in the plastid-derived genes is shaped by this continuing gain and the more general tendency for loss that is characteristic of the recent evolution of plant genes.  