Similar literature
20 similar articles found.
1.
Recently, a new genetic process termed RNA editing has been identified, involving insertions and deletions of nucleotides in particular RNA molecules. Separately, genes exhibit several non-random statistical properties: in particular, the periodicity modulo 3 (P3) associated with an open reading frame, the periodicity modulo 2 (P2) associated with alternating purine/pyrimidine stretches, and the preferential occurrence of YRY(N)6YRY (R = purine = adenine or guanine, Y = pyrimidine = cytosine or thymine, N = R or Y), which represents a "code" of the DNA helix pitch. The problem investigated here is whether a process of the RNA editing type can lead to the non-random statistical properties commonly observed in genes. This paper shows in particular that: (1) the process of insertions and deletions of mononucleotides in the initial sequence [YRY(N)3]* [series of YRY(N)3] can lead to the periodicity modulo 2 (P2); (2) the process of insertions and deletions of trinucleotides in the initial sequence [YRY(N)6]* [series of YRY(N)6] can lead to the periodicity modulo 3 (P3) and the YRY(N)6YRY preferential occurrence. Furthermore, the sequences produced by these two processes correlate strongly with real sequences: the mononucleotide insertion/deletion process with eukaryotic 5' regions, and the trinucleotide insertion/deletion process with eukaryotic protein-coding genes.
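To make the R/Y notation in this abstract concrete, the following is a minimal Python sketch, not the authors' method. It reduces a nucleotide sequence to the purine/pyrimidine alphabet and counts occurrences of the YRY(N)6YRY motif. The function names `to_ry` and `count_yry_n6_yry` are hypothetical, and counting overlapping matches is an assumption of this sketch.

```python
import re

PURINES = set("AG")        # R = adenine or guanine
PYRIMIDINES = set("CTU")   # Y = cytosine or thymine (uracil in RNA)


def to_ry(seq):
    """Reduce a nucleotide sequence to the R/Y alphabet.

    Symbols that are neither purine nor pyrimidine (e.g. ambiguity
    codes) are mapped to '?' so that they never match the motif.
    """
    out = []
    for base in seq.upper():
        if base in PURINES:
            out.append("R")
        elif base in PYRIMIDINES:
            out.append("Y")
        else:
            out.append("?")
    return "".join(out)


# YRY, then six positions that may each be R or Y, then YRY.
# The lookahead lets overlapping occurrences be counted
# (an assumption of this sketch).
YRY_N6_YRY = re.compile(r"(?=YRY[RY]{6}YRY)")


def count_yry_n6_yry(seq):
    """Count (possibly overlapping) YRY(N)6YRY occurrences in seq."""
    return len(YRY_N6_YRY.findall(to_ry(seq)))
```

For example, `to_ry("ACGT")` gives `"RYRY"`, and the 12-base sequence `CATAAAAAACAT` contains the motif exactly once.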

2.
Periodicities in introns. (cited 2 times: 1 self-citation, 1 by others)
The sequence information for the splicing process of introns is found in the consensus sequences at the two splice sites. For long introns, of 300 or more nucleotides, the middle regions may provide additional specificity for splicing, which can be investigated by defining an adequate quantitative parameter. This methodology makes it possible to retrieve the coding periodicity in the viral and mitochondrial introns and to identify, with statistical significance, a surprising alternating purine-pyrimidine base sequence (i.e., a modulo 2 periodicity) in the eukaryotic introns, and particularly in the vertebrate introns. This alternating structure suggests that the vertebrate introns do not carry the genetic information to code for proteins but instead have structural and regulatory functions.

3.
Two human gamma-crystallin genes are linked and riddled with Alu-repeats (cited 7 times: 0 self-citations, 7 by others)
A human genomic cosmid clone, pHcos gamma-1, has been isolated containing two closely linked gamma-crystallin genes, oriented in the same direction. The sequence of these genes and their 5' and 3' flanking regions has been determined. The coding regions of both genes are interrupted by two introns. The first introns (94 and 100 bp, respectively) are located in the 5' region of the genes. The second introns (2.82 and 0.95 kb, respectively) divide the genes into two halves, each encoding a structural domain of the gamma-crystallin protein. The coding regions of the two genes show 80% homology. Due to a mutation in the splice acceptor site of the second intron of the first gene, the coding region of its third exon is 3 bp longer than that of the second gene. In the flanking regions several conserved sequence elements were found, including those elements that are known to be necessary for the correct expression of eukaryotic genes. The flanking and intronic regions of the genes contain 'simple sequence' DNA and Alu repeats. The Alu repeats are usually clustered, contain truncated elements, and are often located near simple sequence DNA.

4.
P. Bucher, G. Yagil. DNA Sequence, 1991, 1(3): 157-172
A program to analyse the length and frequency distribution of specific base tracts in genomic sequences is described. The frequency of oligopurine.oligopyrimidine tracts (R.Y. tracts) in a data base of 163 transcribed genes is analysed and compared. The complete genomes of SV40 virus, N. tabacum chloroplast, yeast 2 micron plasmid, bacteriophage lambda, plasmid pBR322 and the E. coli lac operon are also analysed. A highly significant overrepresentation of oligopurine and oligopyrimidine tracts is observed in all eukaryotic genes examined, as well as in the chloroplast genome. The overrepresentation is evident in all gene subregions of the chloroplast, in the following order: intergenic regions, 3' downstream and 5' upstream (promoter), 5' and 3' untranslated, introns and coding regions. In genes coding for basic proteins, oligopurine rather than oligopyrimidine tracts are found on the coding strand. In prokaryotic genes only the longest R.Y. tracts (greater than or equal to 12) are found in excess, and are concentrated near regulatory regions. While a structural role for R.Y. tracts is most likely in intergenic regions, a functional role, as initiation sites for strand separation, is proposed for regulatory gene regions.
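The kind of tract analysis this abstract describes can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the program the authors describe. A tract is taken here to be a maximal run of purines (A, G) or of pyrimidines (C, T) on the given strand; the function name `tract_length_distribution` and the `min_len` parameter are hypothetical.

```python
from collections import Counter
from itertools import groupby


def _base_class(base):
    """Return 'R' for purines, 'Y' for pyrimidines, None otherwise."""
    if base in "AG":
        return "R"
    if base in "CT":
        return "Y"
    return None


def tract_length_distribution(seq, min_len=1):
    """Tally maximal purine and pyrimidine runs by length.

    Returns {"R": Counter, "Y": Counter}, where each Counter maps a
    tract length to the number of tracts of that length. Runs of
    unrecognised symbols are skipped and also break a tract.
    """
    dist = {"R": Counter(), "Y": Counter()}
    for kind, run in groupby(seq.upper(), key=_base_class):
        length = sum(1 for _ in run)
        if kind is not None and length >= min_len:
            dist[kind][length] += 1
    return dist
```

Because a purine run on one strand pairs with a pyrimidine run on the complementary strand, the two counters together describe the R.Y tracts of a double-stranded sequence. Assessing overrepresentation would additionally require comparing these counts with an expectation for a random sequence of the same composition, which this sketch does not attempt.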

5.
The nucleotide sequence of cDNA clones encoding the three major BIIIB high-sulfur wool keratin proteins (BIIIB2, 3, and 4) and the structure of a BIIIB4 gene and a BIIIB3 pseudogene are reported. Although Southern blot analysis indicates that the BIIIB genes comprise a multigene family in the sheep genome, they are poorly represented in genomic DNA libraries. The family sequence homology of the coding region extends into the 5' and 3' untranslated regions and the near 5' flanking region of the BIIIB3 and 4 genes. These homologies suggest that the BIIIB3 and 4 genes represent the latest gene duplication event in the evolution of the BIIIB multigene family. Like the genes coding for other wool keratin matrix protein components, the BIIIB genes have the conserved 18-bp sequence immediately 5' to the initiation codon and also appear to lack introns.

6.
Most research concerning the evolution of introns has largely considered introns within coding sequences (CDSs), without regard for introns located within untranslated regions (UTRs) of genes. Here, we directly determined intron size, abundance, and distribution in UTRs of genes using full-length cDNA libraries and complete genome sequences for four species: Arabidopsis thaliana, Drosophila melanogaster, human, and mouse. Overall intron occupancy (introns/exon kbp) is lower in 5' UTRs than in CDSs, but intron density (intron occupancy in regions containing introns) tends to be higher in 5' UTRs than in CDSs. Introns in 5' UTRs are roughly twice as large as introns in CDSs, and there is a sharp drop in intron size at the 5' UTR-CDS boundary. We propose a mechanistic explanation for the existence of selection for larger intron size in 5' UTRs, and outline several implications of this hypothesis. We found introns to be randomly distributed within 5' UTRs, so long as a minimum required exon size was assumed. Introns in 3' UTRs were much less abundant than in 5' UTRs. Though this was expected for human and mouse, which have intron-dependent nonsense-mediated decay (NMD) pathways that discourage the presence of introns within the 3' UTR, it was also true for A. thaliana and D. melanogaster, which may lack intron-dependent NMD. Our findings have several implications for theories of intron evolution and genome evolution in general.
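The two measures named in this abstract can be illustrated with a small Python sketch. It assumes, as one reading of the parenthetical definitions, that occupancy is the number of introns per kilobase pair of exon sequence over all regions, and that density is the same ratio restricted to regions that contain at least one intron. The input format, a list of `(exon_bp, n_introns)` pairs, and both function names are hypothetical.

```python
def intron_occupancy(regions):
    """Introns per exon kbp over all regions.

    regions: iterable of (exon_bp, n_introns) pairs, one per region
    (for example, one 5' UTR or one CDS).
    """
    regions = list(regions)
    total_bp = sum(bp for bp, _ in regions)
    total_introns = sum(n for _, n in regions)
    if total_bp == 0:
        return 0.0
    return total_introns / (total_bp / 1000.0)


def intron_density(regions):
    """Intron occupancy computed only over intron-containing regions."""
    return intron_occupancy((bp, n) for bp, n in regions if n > 0)
```

With regions `[(1000, 1), (1000, 0), (2000, 3)]`, occupancy is 4 introns over 4 kbp, and density is 4 introns over the 3 kbp of intron-containing regions, so density is never lower than occupancy.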

7.
The rabbit genome encodes an opal suppressor tRNA gene. The coding region is strictly conserved between the rabbit gene and the corresponding gene in the human genome. The rabbit opal suppressor gene contains the consensus sequence in the 3' internal control region but, like the human and chicken genes, the rabbit 5' internal control region contains two additional nucleotides. The 5' flanking sequences of the rabbit and the human opal suppressor genes contain extensive regions of homology. A subset of these homologies is also present 5' to the chicken opal suppressor gene. Both the rabbit and the human genomes also encode a pseudogene. That of the rabbit lacks the 3' half of the coding region. Neither pseudogene has regions homologous to the 5' flanking regions of the genes. The presence of 5' homologies flanking only the transcribed genes and not the pseudogenes suggests that these regions may be regulatory control elements specifically involved in the expression of the eukaryotic opal suppressor gene. Moreover, the strict conservation of coding sequences indicates functional importance for the opal suppressor tRNA genes.

8.
The statistical study of different populations of genes, with principal component analyses (PCA) and with mean curves, shows: (1) The occurrence probability of the dinucleotide D'=RY (D'=YR), R being a purine base, Y a pyrimidine base, and N any base, after the occurrence of the nucleotide Y (R) in the zero modulo three curve (in the eukaryotic protein coding genes) presents a modulo 9 periodicity with a maximum value nine bases after Y (R). This modulo 9 periodicity is added to the existing coding modulo 3 periodicity RNY, i.e. to the preferential use of the codon RNY in the open reading frame. (2) The occurrence probability of the trinucleotide T=YRY after the occurrence of the nucleotide Y (in the protein coding genes of eukaryotes, chloroplasts, and mitochondria, and in the transfer RNA genes) presents a maximum value eight bases after Y. (3) Similar results are obtained (with less statistical significance) with other gene taxonomic groups (viral protein coding genes, viral introns) and with the complementary motif R(N)8RYR. These results may suggest that the motifs Y(N)8YRY, R(N)8RYR, Y(N)9RY, and R(N)9YR could have a function related to the spatial structure of the DNA sequences.

9.
Structure and evolution of the bovine prothrombin gene (cited 6 times: 0 self-citations, 6 by others)
The cloned bovine prothrombin gene has been characterized by partial DNA sequence analysis, including the 5' and 3' flanking sequences and all the intron-exon junctions. The gene is approximately 15.4 × 10^3 base pairs in length and comprises 14 exons interrupted by 13 introns. The exons coding for the prepro-leader peptide and the gamma-carboxyglutamic acid-containing region are similar in organization to the corresponding exons in the factor IX and protein C genes. This region has probably evolved as a result of recent gene duplication and exon shuffling events. The exons coding for the kringles and the serine protease region of the prothrombin gene are different in organization from the homologous regions in other genes, suggesting that introns have been inserted into these regions after the initial gene duplication events.

10.
At the final step in viral replication, the viral genome must be incorporated into progeny virions, yet the genomic regions required for this process are largely unknown in RNA viruses, including influenza virus. Recently, it was reported that both ends of the neuraminidase (NA) coding region are critically important for incorporation of this vRNA segment into influenza virions (Y. Fujii, H. Goto, T. Watanabe, T. Yoshida, and Y. Kawaoka, Proc. Natl. Acad. Sci. USA 100:2002-2007, 2003). To determine the signals in the hemagglutinin (HA) vRNA required for its virion incorporation, we made a series of deletion constructs of this segment. Subsequent analysis showed that 9 nucleotides at the 3' end of the coding region and 80 nucleotides at the 5' end are sufficient for efficient virion incorporation of the HA vRNA. The utility of this information for stable expression of foreign genes in influenza viruses was assessed by generating a virus whose HA and NA vRNA coding regions were replaced with those of vesicular stomatitis virus glycoprotein (VSVG) and green fluorescent protein (GFP), respectively, while retaining virion incorporation signals for these segments. Despite the lack of HA and NA proteins, the resultant virus, which possessed only VSVG on the virion surface, was viable and produced GFP-expressing plaques in cells even after repeated passages, demonstrating that two foreign genes can be incorporated and maintained stably in influenza A virus. These findings could serve as a model for the construction of influenza A viruses designed to express and/or deliver foreign genes.

11.
12.
13.
Organization of the human protein S genes (cited 6 times: 0 self-citations, 6 by others)
Human genomic clones that span the entire protein S expressed gene (PS alpha) and the 3' two-thirds of the protein S pseudogene (PS beta) have been isolated and characterized. The PS alpha gene is greater than 80 kilobases in length and contains 14 introns and 15 exons, as well as 6 repetitive "Alu" sequences. Exons I and XV contain 112 and 1139 bp 5' and 3' noncoding segments in addition to the amino and carboxyl termini, respectively. Exons I-VIII encode protein segments that are homologous to the vitamin K dependent clotting proteins and are bounded by introns whose position and type are identical with other members of this protein family. Exons IX-XV encode protein segments homologous to sex hormone binding globulin (SHBG) and are bounded by introns of identical type and position as in the SHBG gene. Genomic clones for the PS beta gene cover a distance of greater than 55 kilobases and contain segments corresponding to amino acids 46-635 of the mature protein and the 1.1-kb 3' noncoding region of the cDNA. The presence of multiple base changes in the coding portions of this gene, resulting in termination codons and frame shifts, suggests that it is a pseudogene. Comparison of DNA sequences for the two genes reveals 97% identity for coding and 3' noncoding, and 95.4% for intronic regions, suggesting divergence of the two genes is a relatively recent event.

14.
The DNA sequence of a chicken genomic fragment containing a histone H2A gene has been determined. It contains extensive 5' and 3' flanking regions and encodes a protein identical in sequence to the histone H2A protein isolated from chicken erythrocytes. In the 5' flanking region, a possible "TATA box" and three possible "cap sites" can be recognised upstream from the initiation codon. To the 5' side of the "TATA box" is found an unusual sequence of 21 A's interrupted by a central G residue. It occupies the same relative position as the P. miliaris H2A gene-specific 5' dyad symmetry sequence and the "CCAAT box" seen in other eukaryotic polymerase II genes but is clearly different from both. A significant feature of the 3' non-coding region is the presence of a 23 base-pair sequence that is nearly identical to a conserved region found in sea urchin histone genes. The coding region is extremely GC rich, with strong selection for these bases in the third position of codons. Not a single coding triplet ends in U. No intervening sequences were found in this gene.

15.
It has been proposed that eukaryotic spliceosomes evolved from bacterial group II introns via constructive neutral changes. However, a more likely interpretation is that spliceosomes and group II introns share a common undefined RNA ancestor, a proto-spliceosome. Although constructive neutral evolution probably played some role in the development of complexity, including the evolution of modern spliceosomes, the origin, losses, and retention of spliceosomes can be explained straightforwardly, mainly by positive and negative selection: (1) Proto-spliceosomes evolved in the RNA world as a mechanism to excise functional RNAs from an RNA genome and to join non-coding information (ancestral to exons) possibly designed to be degraded. (2) The complexity of proto-spliceosomes increased with the invention of protein synthesis in the RNP world, and they were adopted for (a) the addition of translation signals to RNAs via trans-splicing, and (b) exon shuffling, that is, joining together exons coding for separate protein domains so that they are translated as a single unit, thus facilitating the molecular interaction of protein domains that need to be assembled into functional catalytic complexes. (3) Finally, spliceosomes were adopted for cis-splicing of (mainly) non-coding information (contemporary introns) to yield translatable mRNAs. (4) Spliceosome-negative organisms (i.e., prokaryotes) have been selected in the DNA-protein world to save a lot of energy. (5) Spliceosome-positive organisms (i.e., eukaryotes) have been selected because they have become completely spliceosome-dependent.

16.
A new twist in trypanosome RNA metabolism: cis-splicing of pre-mRNA (cited 6 times: 1 self-citation, 5 by others)
It has been known for almost a decade and a half that in trypanosomes all mRNAs are trans-spliced by addition to the 5' end of the spliced leader (SL) sequence. During the same time period the conviction developed that classical cis-splicing introns are not present in the trypanosome genome and that the trypanosome gene arrangement is highly compact with small intergenic regions separating one gene from the next. We have now discovered that these tenets are no longer true. Poly(A) polymerase (PAP) genes in Trypanosoma brucei and Trypanosoma cruzi are split by intervening sequences of 653 and 302 nt, respectively. The intervening sequences occur at identical positions in both organisms and obey the GT/AG rule of cis-splicing introns. PAP mRNAs are trans-spliced at the very 5' end as well as internally at the 3' splice site of the intervening sequence. Interestingly, 11 nucleotide positions past the actual 5' splice site are conserved between the T. brucei and T. cruzi introns. Point mutations in these conserved positions, as well as in the AG dinucleotide of the 3' splice site, abolish intron removal in vivo. Our results, together with the recent discovery of cis-splicing introns in Euglena gracilis, suggest that both trans- and cis-splicing are ancient acquisitions of the eukaryotic cell.

17.
18.
19.
The complete human dihydrofolate reductase (DHFR) gene has been cloned from four recombinant lambda libraries constructed with the DNA from a methotrexate-resistant human cell line with amplified DHFR genes. The detailed organization of the gene has been determined by restriction mapping of the cloned fragments and DNA sequencing of all the protein coding regions and adjacent intron segments, and shown to correspond to that of the native human DHFR gene. The gene spans a length of approximately 29 × 10^3 bases from the ATG initiator codon to the end of the 3' untranslated region, and contains five introns that interrupt the protein coding sequence. The number and positions of introns are identical to those found in the mouse gene. By contrast, the size of the homologous introns (with the exception of the first one) varies greatly, up to several fold, in the genes from man, mouse and Chinese hamster; the intron sequences also exhibit a great divergence, except in the junction regions. A striking sequence homology, extending over several hundred nucleotides, exists between the human and mouse gene 5' non-coding regions. These regions are characterized by an unusually high G + C content, 72% and 66% in the human and mouse genes, respectively, which is maintained in the first coding segment and first intron, and is in sharp contrast to the relatively low G + C content (approximately 40%) of the remainder of the gene.

20.