Similar articles
 20 similar articles found (search time: 15 ms)
1.
De novo origin of coding sequence remains an obscure issue in molecular evolution. One of the possible paths for addition (subtraction) of DNA segments to (from) a gene is stop codon shift. Single nucleotide substitutions can destroy the existing stop codon, leading to uninterrupted translation up to the next stop codon in the gene’s reading frame, or create a premature stop codon via a nonsense mutation. Furthermore, frameshifts caused by short indels near a gene’s end may lead to premature stop codons or to translation past the existing stop codon. Here, we describe the evolution of the length of coding sequence of prokaryotic genes by change of positions of stop codons. We observed cases of addition of regions of 3′UTR to genes due to mutations at the existing stop codon, and cases of subtraction of C-terminal coding segments due to nonsense mutations upstream of the stop codon. Many of the observed stop codon shifts cannot be attributed to sequencing errors or rare deleterious variants segregating within bacterial populations. The additions of regions of 3′UTR tend to occur in those genes in which they are facilitated by nearby downstream in-frame triplets which may serve as new stop codons. Conversely, subtractions of coding sequence often give rise to in-frame stop codons located nearby. The amino acid composition of the added region is significantly biased, compared to the overall amino acid composition of the genes. Our results show that in prokaryotes, shift of stop codon is an underappreciated contributor to functional evolution of gene length.
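As a rough illustration of the stop codon shift described in this abstract, the sketch below scans a sequence codon by codon for the first in-frame stop. It is not the authors' method: the function name `next_in_frame_stop`, the toy sequences, and the choice of TAA/TAG/TGA as the stop set are assumptions made for the example.

```python
# Assumed stop set (standard code, DNA alphabet); not taken from the abstract.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def next_in_frame_stop(seq, start=0):
    """Return the index of the first in-frame stop codon at or after `start`.

    Returns None when no complete in-frame stop codon is found.
    """
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


# Hypothetical toy gene: ATG AAA TAA followed by a short 3' flanking region.
wild_type = "ATGAAATAAGGCTGATT"
# Same sequence with a single substitution (TAA -> CAA) destroying the stop.
mutant = "ATGAAACAAGGCTGATT"

old_stop = next_in_frame_stop(wild_type)
new_stop = next_in_frame_stop(mutant)
added_codons = (new_stop - old_stop) // 3  # sense codons gained by readthrough
print(old_stop, new_stop, added_codons)
```

In this toy case the substitution moves the stop from the third codon to the fifth, so two codons of former 3′ flanking sequence become coding.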

2.
Castle JC 《PloS one》2011,6(6):e20660
Rates of SNPs (single nucleotide polymorphisms) and cross-species genomic sequence conservation reflect intra- and inter-species variation, respectively. Here, I report SNP rates and genomic sequence conservation adjacent to mRNA processing regions and show that, as expected, more SNPs occur in less conserved regions and that functional regions have fewer SNPs. Results are confirmed using both mouse and human data. Regions include protein start codons, 3' splice sites, 5' splice sites, protein stop codons, predicted miRNA binding sites, and polyadenylation sites. Throughout, SNP rates are lower and conservation is higher at regulatory sites. Within coding regions, SNP rates are highest and conservation is lowest at codon position three and the fewest SNPs are found at codon position two, reflecting codon degeneracy for amino acid encoding. Exon splice sites show high conservation and very low SNP rates, reflecting both splicing signals and protein coding. Relaxed constraint on the codon third position is dramatically seen when separating exonic SNP rates based on intron phase. At polyadenylation sites, a peak of conservation and low SNP rate occurs from 30 to 17 nt preceding the site. This region is highly enriched for the sequence AAUAAA, reflecting the location of the conserved polyA signal. miRNA 3' UTR target sites are predicted incorporating interspecies genomic sequence conservation; SNP rates are low in these sites, again showing fewer SNPs in conserved regions. Together, these results confirm that SNPs, reflecting recent genetic variation, occur more frequently in regions with less evolutionary conservation.

3.
4.
In recent years, the amount of molecular sequencing data from Tetrahymena thermophila has dramatically increased. We analyzed G + C content, codon usage, initiator codon context and stop codon sites in the extremely A + T rich genome of this ciliate. Average G + C content was 38% for protein coding regions, 21% for 5' non-coding sequences, 19% for 3' non-coding sequences, 15% for introns, 19% for micronuclear limited sequences and 17% for macronuclear retained sequences flanking micronuclear specific regions. The 75 available T. thermophila protein coding sequences favored codons ending in T and, where possible, avoided those with G in the third position. Highly expressed genes were relatively G + C-rich and exhibited an extremely biased pattern of codon usage while developmentally regulated genes were more A + T-rich and showed less codon usage bias. Regions immediately preceding Tetrahymena translation initiator codons were generally A-rich. For the 60 stop codons examined, the frequency of G in the end + 1 site was much higher than expected whereas C never occupied this position.
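The two quantities this abstract reports, G + C content and the base found at the third codon position, can be computed along the following lines. This is a minimal sketch under stated assumptions: the helper names `gc_content` and `third_position_counts` and the example sequence are hypothetical and do not come from the study.

```python
from collections import Counter


def gc_content(seq):
    """Fraction of G and C bases in `seq` (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def third_position_counts(cds):
    """Count the bases found at the third position of each complete codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return Counter(cds[i + 2] for i in range(0, usable, 3))


# Hypothetical coding fragment: ATG AAA TTT
example_cds = "ATGAAATTT"
print(gc_content(example_cds))             # overall G + C fraction
print(third_position_counts(example_cds))  # bases ending each codon
```

Applied to a set of real coding sequences, the third-position counts would show the kind of preference the abstract describes (codons ending in T favored, G avoided); the toy fragment here only demonstrates the bookkeeping.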

5.
The genomic RNAs of flaviviruses such as dengue virus (DEN) have a 5' m7GpppN cap like those of cellular mRNAs but lack a 3' poly(A) tail. We have studied the contributions to translational expression of 5'- and 3'-terminal regions of the DEN serotype 2 genome by using luciferase reporter mRNAs transfected into Vero cells. DCLD RNA contained the entire DEN 5' and 3' untranslated regions (UTRs), as well as the first 36 codons of the capsid coding region fused to the luciferase reporter gene. Capped DCLD RNA was as efficiently translated in Vero cells as capped GLGpA RNA, a reporter with UTRs from the highly expressed alpha-globin mRNA and a 72-residue poly(A) tail. Analogous reporter RNAs with regulatory sequences from West Nile and Sindbis viruses were also strongly expressed. Although capped DCLD RNA was expressed much more efficiently than its uncapped form, uncapped DCLD RNA was translated 6 to 12 times more efficiently than uncapped RNAs with UTRs from globin mRNA. The 5' cap and DEN 3' UTR were the main sources of the translational efficiency of DCLD RNA, and they acted synergistically in enhancing translation. The DEN 3' UTR increased mRNA stability, although this effect was considerably weaker than the enhancement of translational efficiency. The DEN 3' UTR thus has translational regulatory properties similar to those of a poly(A) tail. Its translation-enhancing effect was observed for RNAs with globin or DEN 5' sequences, indicating no codependency between viral 5' and 3' sequences. Deletion studies showed that translational enhancement provided by the DEN 3' UTR is attributable to the cumulative contributions of several conserved elements, as well as a nonconserved domain adjacent to the stop codon. One of the conserved elements was the conserved sequence (CS) CS1 that is complementary to cCS1 present in the 5' end of the DEN polyprotein open reading frame. Complementarity between CS1 and cCS1 was not required for efficient translation.

6.
Properties of mRNA leading regions that modulate protein synthesis are little known (besides effects of their secondary structure). Here I explore how coding properties of leading regions may account for their disparate efficiencies. Trinucleotides that form off frame stop codons decrease costs of ribosomal slippages during protein synthesis: protein activity (as a proxy of gene expression, and as measured in experiments using artificial variants of 5' leading sequences of beta galactosidase in Escherichia coli) increases proportionally to the number of stop motifs in any frame in the 5' leading region. This suggests that stop codons in the 5' leading region, upstream of the recognized coding sequence, terminate eventual translations that sometimes start before ribosomes reach the mRNA's recognized start codon, increasing efficiency. This hypothesis is confirmed by further analyses: mRNAs with 5' leading regions containing in the same frame a start preceding a stop codon (in any frame) produce less enzymatic activity than those with the stop preceding the start. Hence coding properties, in addition to other properties, such as the secondary structure of the 5' leading region, regulate translation. This experimentally (a) confirms that within coding regions, off frame stops increase protein synthesis efficiency by early stopping frameshifted translation; (b) suggests that this occurs for all frames also in 5' leading regions and that (c) several alternative start codons that function at different probabilities should routinely be considered for all genes in the region of the recognized initiation codon. An unknown number of short peptides might be translated from coding and non-coding regions of RNAs.
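The "number of stop motifs in any frame" that this abstract relates to protein activity can be tallied roughly as follows. This is a sketch under assumptions, not the author's procedure: the function name `stop_motifs_by_frame`, the DNA alphabet, and the example leader sequence are all illustrative.

```python
# Assumed stop set (standard code, DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_motifs_by_frame(seq):
    """Return [n0, n1, n2]: stop-codon counts in reading frames 0, 1 and 2."""
    seq = seq.upper()
    counts = []
    for frame in range(3):
        n = sum(
            1
            for i in range(frame, len(seq) - 2, 3)
            if seq[i:i + 3] in STOP_CODONS
        )
        counts.append(n)
    return counts


# Hypothetical 5' leader fragment.
leader = "ATGATAGCTAA"
per_frame = stop_motifs_by_frame(leader)
print(per_frame, sum(per_frame))  # per-frame counts and total over all frames
```

Here frame 0 contains no stop, frame 1 contains TGA and TAG, and frame 2 contains TAA, which gives three stop motifs in total across frames.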

7.
Nonsense-mediated decay (NMD) is a eukaryotic cellular RNA surveillance and quality-control mechanism that degrades mRNA containing premature stop codons (nonsense mutations) that otherwise may exert a deleterious effect by the production of dysfunctional truncated proteins. Collagen X (COL10A1) nonsense mutations in Schmid-type metaphyseal chondrodysplasia are localized in a region toward the 3' end of the last exon (exon 3) and result in mRNA decay, in contrast to most other genes in which terminal-exon nonsense mutations are resistant to NMD. We introduce nonsense mutations into the mouse Col10a1 gene and express these in a hypertrophic-chondrocyte cell line to explore the mechanism of last-exon mRNA decay of Col10a1 and demonstrate that mRNA decay is spatially restricted to mutations occurring in a 3' region of the exon 3 coding sequence; this region corresponds to where human mutations have been described. This localization of mRNA-decay competency suggested that a downstream region, such as the 3' UTR, may play a role in specifying decay of mutant Col10a1 mRNA containing nonsense mutations. We found that deleting any of the three conserved sequence regions within the 3' UTR (region I, 23 bp; region II, 170 bp; and region III, 76 bp) prevented mutant mRNA decay, but a smaller 13 bp deletion within region III was permissive for decay. These data suggest that the 3' UTR participates in collagen X last-exon mRNA decay and that overall 3' UTR configuration, rather than specific linear-sequence motifs, may be important in specifying decay of Col10a1 mRNA containing nonsense mutations.

8.
Transterm facilitates studies of messenger RNAs and translational control signals. Each messenger RNA (mRNA) from GenBank is extracted and broken into its functional components, its coding sequence, initiation context, termination context, flanking sequence representing its 5' UTR (untranslated region), 3' UTR and translational signals. In addition, numerical parameters characterising each coding region in Transterm, including codon and GC bias, are available. For each species in Transterm, the initiation and termination regions are aligned by their start or stop codons and presented as base frequency matrices and tables of the information content of the bases in the alignments. Users can obtain summaries of characteristics of the mRNAs for species of their choice and search for translational signals both in the Transterm database and in their own sequence. The current release contains data from over 10 000 species, including the complete genomes of 20 prokaryotes and three eukaryotes. Both flat-file and relational database forms of Transterm are accessible via the WWW at http://biochem.otago.ac.nz/Transterm/

9.
10.
11.
12.
The mRNA surveillance system is known to rapidly degrade aberrant mRNAs that contain premature termination codons in a process referred to as nonsense-mediated decay. A second class of aberrant mRNAs are those wherein the 3' UTR is abnormally extended due to a mutation in the polyadenylation site. We provide several observations that these abnormally 3'-extended mRNAs are degraded by the same machinery that degrades mRNAs with premature nonsense codons. First, the decay of the 3'-extended mRNAs is dependent on the same decapping enzyme and 5'-to-3' exonuclease. Second, the decay is also dependent on the proteins encoded by the UPF1, UPF2, and UPF3 genes, which are known to be specifically required for the rapid decay of mRNAs containing nonsense codons. Third, the ability of an extended 3' UTR to trigger decay is prevented by stabilizing sequences within the PGK1 coding region that are known to protect mRNAs from the rapid decay induced by premature nonsense codons. These results indicate that the mRNA surveillance system plays a role in degrading abnormally extended 3' UTRs. Based on these results, we propose a model in which the mRNA surveillance machinery degrades aberrant mRNAs due to the absence of the proper spatial arrangement of the translation-termination codon with respect to the 3' UTR element as defined by the utilization of a polyadenylation site.

13.
14.
Classical swine fever virus (CSFV) is a member of the pestivirus family, which shares many features in common with hepatitis C virus (HCV). It is shown here that CSFV has an exceptionally efficient cis-acting internal ribosome entry segment (IRES), which, like that of HCV, is strongly influenced by the sequences immediately downstream of the initiation codon, and is optimal with viral coding sequences in this position. Constructs that retained 17 or more codons of viral coding sequence exhibited full IRES activity, but with only 12 codons, activity was approximately 66% of maximum in vitro (though close to maximum in transfected BHK cells), whereas with just 3 codons or fewer, the activity was only approximately 15% of maximum. The minimal coding region elements required for high activity were exchanged between HCV and CSFV. Although maximum activity was observed in each case with the homologous combination of coding region and 5' UTR, the heterologous combinations were sufficiently active to rule out a highly specific functional interplay between the 5' UTR and coding sequences. On the other hand, inversion of the coding sequences resulted in low IRES activity, particularly with the HCV coding sequences. RNA structure probing showed that the efficiency of internal initiation of these chimeric constructs correlated most closely with the degree of single-strandedness of the region around and immediately downstream of the initiation codon. The low activity IRESs could not be rescued by addition of supplementary eIF4A (the initiation factor with ATP-dependent RNA helicase activity). The extreme sensitivity to secondary structure around the initiation codon is likely to be due to the fact that the eIF4F complex (which has eIF4A as one of its subunits) is not required for and does not participate in initiation on these IRESs.

15.
Type 2 deiodinase (D2) is a low Km iodothyronine deiodinase that metabolizes thyroxine (T4) to the active metabolite T3. We have recently shown that the cDNA for the human D2 coding region contains two in-frame selenocysteine (TGA) codons. The 3' TGA is seven codons 5' to a universal stop codon, TAA. The human D2 enzyme, transiently expressed in HEK-293 cells, can be in vivo labeled with 75Se as a doublet of approximately 31 kDa. This doublet is consistent with the possibility that the carboxy-terminal TGA codon can either encode selenocysteine or function as a stop codon. To test this hypothesis we mutagenized the second selenocysteine codon to a cysteine (TGC) or to an unambiguous stop codon (TAA). While the selenium incorporation pattern is different between the wild-type and mutant proteins, the deiodination properties of the enzyme are not affected by mutating the 3'TGA codon. Thus, we conclude that neither this residue nor the remaining seven carboxy-terminal amino acids are critical for the deiodination process.

16.
Eukaryotic cells target mRNAs to the nonsense-mediated mRNA decay (NMD) pathway when translation terminates within the coding region. In mammalian cells, this is presumably due to a downstream signal deposited during pre-mRNA splicing. In contrast, unspliced retroviral RNA undergoes NMD in chicken cells when premature termination codons (PTCs) are present in the gag gene. Surprisingly, deletion of a 401-nt 3' UTR sequence immediately downstream of the normal gag termination codon caused this termination event to be recognized as premature. We termed this 3' UTR region the Rous sarcoma virus (RSV) stability element (RSE). The RSE also stabilized the viral RNA when placed immediately downstream of a PTC in the gag gene. Deletion analysis of the RSE indicated a smaller functional element. We conclude that this 3' UTR sequence stabilizes termination codons in the RSV RNA, and termination codons not associated with such an RSE sequence undergo NMD.

17.
18.
The nearest 5' context of 2559 human stop codons was analysed in comparison with the same context of stop-like codons (UGG, UGC, UGU, CGA for UGA; CAA, UAU, UAC for UAA; and UGG, UAU, UAC, CAG for UAG). The non-random distribution of some nucleotides upstream of the stop codons was observed. For instance, uridine is over-represented in position -3 upstream of UAG. Several codons were shown to be over-represented immediately upstream of the stop codons: UUU(Phe), AGC(Ser), and the Lys and Ala codon families before UGA; AAG(Lys), GCG(Ala), and the Ser and Leu codon families before UAA; and UCA(Ser), AUG(Met), and the Phe codon family before UAG. In contrast, the Thr and Gly codon families were under-represented before UGA, while ACC(Thr) and the Gly codon family were under-represented before UAG and UAA respectively. In an earlier study, uridine was shown to be over-represented in position -3 before UGA in Escherichia coli [Arkov,A.L., Korolev,S.V. and Kisselev,L.L. (1993) Nucleic Acids Res., 21,2891-2897]. In that study, the codons for Lys, Phe and Ser were shown to be over-represented immediately upstream of E. coli stop codons. Consequently, E. coli and human termination codons have similar 5' contexts. The present study suggests that the 5' context of stop codons may modulate the efficiency of peptide chain termination and (or) stop codon readthrough in higher eukaryotes, and that the mechanisms of such a modulation in prokaryotes and higher eukaryotes may be very similar.
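A tally of the codon immediately upstream of each stop codon, the basic count behind the over- and under-representation reported in this abstract, might look like the sketch below. It is an illustration only: `upstream_codon_counts` and the three toy sequences are hypothetical, the DNA alphabet is used (T in place of U), and no statistical test of over-representation is attempted.

```python
from collections import Counter, defaultdict

# Assumed stop set (standard code, DNA alphabet; T stands in for U).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def upstream_codon_counts(coding_sequences):
    """Map each stop codon to a Counter of the codons directly 5' of it.

    Sequences that are not a whole number of codons, are too short,
    or do not end in a stop codon are skipped.
    """
    counts = defaultdict(Counter)
    for cds in coding_sequences:
        cds = cds.upper()
        if len(cds) < 6 or len(cds) % 3 != 0:
            continue
        stop = cds[-3:]
        if stop not in STOP_CODONS:
            continue
        counts[stop][cds[-6:-3]] += 1  # last sense codon before the stop
    return counts


# Hypothetical toy coding sequences, each ending in a stop codon.
toy_set = ["ATGTTTTGA", "ATGAAGTAA", "ATGTTTTGA"]
counts = upstream_codon_counts(toy_set)
print(dict(counts))
```

Judging over-representation would additionally require comparing these counts with an expected frequency, for example the same tally taken before the stop-like codons the abstract uses as its control.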

19.
Distribution of 1000 sequenced T-DNA tags in the Arabidopsis genome   Total citations: 6, self-citations: 0, citations by others: 6
Induction of knockout mutations by T-DNA insertion mutagenesis is widely used in studies of plant gene functions. To assess the efficiency of this genetic approach, we have sequenced PCR amplified junctions of 1000 T-DNA insertions and analysed their distribution in the Arabidopsis genome. Map positions of 973 tags could be determined unequivocally, indicating that the majority of T-DNA insertions landed in chromosomal domains of high gene density. Only 4.7% of insertions were found in interspersed, centromeric, telomeric and rDNA repeats, whereas 0.6% of sequenced tags identified chromosomally integrated segments of organellar DNAs. 35.4% of T-DNAs were localized in intervals flanked by ATG and stop codons of predicted genes, showing a distribution of 62.2% in exons and 37.8% in introns. The frequency of T-DNA tags in coding and intergenic regions showed a good correlation with the predicted size distribution of these sequences in the genome. However, the frequency of T-DNA insertions in 3'- and 5'-regulatory regions of genes, corresponding to 300 bp intervals 3' downstream of stop and 5' upstream of ATG codons, was 1.7-2.3-fold higher than in any similar interval elsewhere in the genome. The additive frequency of insertions in 5'-regulatory regions and coding domains provided an estimate for the mutation rate, suggesting that 47.8% of mapped T-DNA tags induced knockout mutations in Arabidopsis.

20.
The processes accompanying endosymbiosis have led to a complex network of interorganellar protein traffic that originates from nuclear genes encoding mitochondrial and plastid proteins. A significant proportion of nucleus-encoded organellar proteins are dual targeted, and the process by which a protein acquires the capacity for both mitochondrial and plastid targeting may involve intergenic DNA exchange coupled with the incorporation of sequences residing upstream of the gene. We evaluated targeting and sequence alignment features of two organellar DNA polymerase genes from Arabidopsis thaliana. Within one of these two loci, protein targeting appeared to be plastidic when the 5' untranslated leader region (UTR) was deleted and translation could only initiate at the annotated ATG start codon but dual targeted when the 5' UTR was included. Introduction of stop codons at various sites within the putative UTR demonstrated that this region is translated and influences protein targeting capacity. However, no ATG start codon was found within this upstream, translated region, suggesting that translation initiates at a non-ATG start. We identified a CTG codon that likely accounts for much of this initiation. Investigation of the 5' region of other nucleus-encoded organellar genes suggests that several genes may incorporate upstream sequences to influence targeting capacity. We postulate that a combination of intergenic recombination and some relaxation of constraints on translation initiation has acted in the evolution of protein targeting specificity for those proteins capable of functioning in both plastids and mitochondria.
