Similar articles (20 found)
1.
2.
Prediction of splice junctions in mRNA sequences.   Total citations: 8 (self-citations: 6, citations by others: 2)
K Nakata, M Kanehisa, C DeLisi. Nucleic Acids Research, 1985, 13(14):5327-5340
A general method based on the statistical technique of discriminant analysis is developed to distinguish boundaries of coding and non-coding regions in nucleic acid sequences. In particular, the method is applied to the prediction of splicing sites in messenger RNA precursors. Information used for discrimination includes consensus sequence patterns around splice junctions, free energy of snRNA and mRNA base pairing, and statistical differences between coding and non-coding regions such as periodic appearance of specific bases in coding regions reflecting the non-random usage of degenerate codons. Given the reading frame of an exon (but not the exon/intron boundaries), the method will predict the following exon, namely, the intron to be excised out. When applied to human sequences in the GenBank database, the method correctly identified 80% of true splice junctions.

3.
A general role for splicing enhancers in exon definition   Total citations: 6 (self-citations: 0, citations by others: 6)
Exonic splicing enhancers (ESEs) facilitate exon definition by assisting in the recruitment of splicing factors to the adjacent intron. Here we demonstrate that suboptimal 5' and 3' splice sites are activated independently by ESEs when they are located on different exons. However, when they are situated within a single exon, the same weak 5' and 3' splice sites are activated simultaneously by a single ESE. These findings demonstrate that a single ESE promotes the recognition of both exon/intron junctions within the same step during exon definition. Our results suggest that ESEs recruit a multicomponent complex that minimally contains components of the splicing machinery required for 5' and 3' splice site selection.

4.
In Drosophila melanogaster, synonymous codons corresponding to the most abundant cognate tRNAs are used more frequently, especially in highly expressed genes. Increased use of such "optimal" codons is considered an adaptation for translational efficiency. Need it always be the case that selection should favor the use of a translationally optimal codon? Here, we investigate one possible confounding factor, namely, the need to specify information in exons necessary to enable correct splicing. As expected from such a model, in Drosophila many codons show different usage near intron-exon boundaries versus exon core regions. However, this finding is in principle also consistent with Hill-Robertson effects modulating usage of translationally optimal codons. Nonetheless, several results support the splice model over the translational selection model: 1) the trends in codon usage are strikingly similar to those in mammals in which codon usage near boundaries correlates with abundance in exonic splice enhancers (ESEs), 2) codons preferred near boundaries tend to be enriched for A and avoid C (conversely those avoided near boundaries prefer C rather than A), as expected were ESEs involved, and 3) codons preferred near boundaries are typically not translationally optimal. We conclude that usage of translationally optimal codons is compromised in the vicinity of splice junctions in intron-containing genes, to the effect that we observe higher levels of usage of translationally optimal codons at the center of exons. On the gene level, however, controlling for known correlates of codon bias, the impact on codon usage patterns is quantitatively small. These results have implications for inferring aspects of the mechanism of splicing given nothing more than a well-annotated genome.

5.

Background

In mammals, splice-regulatory domains impose marked trends on the relative abundance of certain amino acids near exon-intron boundaries. Is this a mammalian particularity or symptomatic of exonic splicing regulation across taxa? Are such trends more common in species that a priori have a harder time identifying exon ends, that is, those with pre-mRNA rich in intronic sequence? We address these questions surveying exon composition in a sample of phylogenetically diverse genomes.

Results

Biased amino acid usage near exon-intron boundaries is common throughout the metazoa but not restricted to the metazoa. There is extensive cross-species concordance as to which amino acids are affected, and reduced/elevated abundances are well predicted by knowledge of splice enhancers. Species expected to rely on exon definition for splicing, that is, those with a higher ratio of intronic to coding sequence, more introns per gene and longer introns, exhibit more amino acid skews. Notably, this includes the intron-rich basidiomycete Cryptococcus neoformans, which, unlike intron-poor ascomycetes (Schizosaccharomyces pombe, Saccharomyces cerevisiae), exhibits compositional biases reminiscent of the metazoa. Strikingly, 5 prime ends of nematode exons deviate radically from normality: amino acids strongly preferred near boundaries are strongly avoided in other species, and vice versa. This we suggest is a measure to avoid attracting trans-splicing machinery.

Conclusion

Constraints on amino acid composition near exon-intron boundaries are phylogenetically widespread and characteristic of species where exon localization should be problematic. That compositional biases accord with sequence preferences of splice-regulatory proteins and are absent in ascomycetes is consistent with selection on exonic splicing regulation.

6.
We have discovered that positions of splice junctions in genes are constrained by the tolerance for disorder-promoting amino acids in the translated protein region. It is known that efficient splicing requires nucleotide bias at the splice junction; the preferred usage produces a distribution of amino acids that is disorder-promoting. We observe that efficiency of splicing, as seen in the amino-acid distribution, is not compromised to accommodate globular structure. Thus we infer that it is the positions of splice junctions in the gene that must be under constraint by the local protein environment. Examining exonic splicing enhancers found near the splice junction in the gene reveals that these short DNA motifs are more prevalent in exons that encode disordered protein regions than in exons encoding structured regions. Thus we also conclude that local protein features constrain efficient splicing more in structure than in disorder.

7.
Schizosaccharomyces pombe pre-mRNAs are generally multi-intronic and share certain features with pre-mRNAs from Drosophila melanogaster, in which initial splice site pairing can occur via either exon or intron definition. Here, we present three lines of evidence suggesting that, despite these similarities, fission yeast splicing is most likely restricted to intron definition. First, mutating either or both splice sites flanking an internal exon in the S. pombe cdc2 gene produced almost exclusively intron retention, in contrast to the exon skipping observed in vertebrates. Second, we were unable to induce skipping of the internal microexon in fission yeast cgs2, whereas the default splicing pathway excludes extremely small exons in mammals. Because nearly quantitative removal of the downstream intron in cgs2 could be achieved by expanding the microexon, we propose that its retention is due to steric occlusion. Third, several cryptic 5' junctions in the second intron of fission yeast cdc2 are located within the intron, in contrast to their generally exonic locations in metazoa. The effects of expanding and contracting this intron are as predicted by intron definition; in fact, even highly deviant 5' junctions can compete effectively with the standard 5' splice site if they are closer to the 3' splicing signals. Taken together, our data suggest that pairing of splice sites in S. pombe most likely occurs exclusively across introns in a manner that favors excision of the smallest segment possible.

8.
A new analytical method has been used to examine the set of 40 exon/intron boundaries within the rat embryonic myosin heavy chain (MHCemb) gene. It has also been applied to an additional set of 850 splice sequences selected from GenBank. Strong evidence is obtained for the involvement of 3' ends but not 5' ends of exon sequences in splice site recognition. It can be determined that signal sequences of 5' intron ends concentrate near the splice borders, while the distributions of the 3' intron ends have a diffuse character. The possibility of re-interpreting some known features, in terms of the absence of certain elements rather than the presence of elements forming sequence determinants, is discussed. The analysis undertaken enabled us to work out a more detailed set of recognition sequence requirements for the splicing of nuclear pre-mRNA. In addition to requirements which have already been established we suggest the following: the ‘AG-absence’ in the immediate 3' terminal intron sequences; and a minimal match between a particular sequence and the known exon/intron consensus sequence of 5' splice junctions. Received on March 22, 1988; accepted on November 19, 1988

9.
10.
In mammals there is a bias in amino acid usage near splice sites that is explained, in large part, by the high density of exonic splicing enhancers (ESEs) in these regions. Is there a similar bias for the relative use of synonymous codons, and can any such bias be predicted by their abundance in ESEs? Prior reports suggested that such trends may exist. From analysis of human exons, we find that 47 of the 59 codons with at least one synonym show differential usage in the proximity of exon ends, of which 42 remain significant after correction for multiple testing. Within sets of synonymous codons those more preferred near splice sites are generally those that are relatively more abundant within the ESEs. However, the examples given previously appear exceptionally good fits and there exist many exceptions, the usage of lysine's codons being a case in point. Similar results are observed in mouse exons. We conclude that splice regulation impacts on the choice of synonymous codons in mammals, but the magnitude of this effect is less than might at first have been supposed.

11.
A comparison of the nucleotide sequences around the splice junctions that flank old (shared by two or more major lineages of eukaryotes) and new (lineage-specific) introns in eukaryotic genes reveals substantial differences in the distribution of information between introns and exons. Old introns have a lower information content in the exon regions adjacent to the splice sites than new introns but have a corresponding higher information content in the intron itself. This suggests that introns insert into nonrandom (proto-splice) sites but, during the evolution of an intron after insertion, the splice signal shifts from the flanking exon regions to the ends of the intron itself. Accumulation of information inside the intron during evolution suggests that new introns largely emerge de novo rather than through propagation and migration of old introns.
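
The "information content" compared in the abstract above is commonly computed per alignment position as the maximum entropy minus the observed Shannon entropy. The following is a minimal sketch of that generic calculation, not the authors' pipeline: the function name `information_content`, the DNA alphabet default, and the absence of any small-sample correction are all assumptions made for illustration.

```python
import math
from collections import Counter


def information_content(aligned_seqs, alphabet="ACGT"):
    """Return per-position information content (bits) for equal-length sequences.

    Hypothetical helper for illustration: information = log2(|alphabet|) - H,
    where H is the Shannon entropy of the observed base frequencies.
    No small-sample correction is applied.
    """
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must have equal length")

    max_bits = math.log2(len(alphabet))
    profile = []
    for pos in range(length):
        counts = Counter(s[pos] for s in aligned_seqs)
        # Only count symbols from the alphabet (ignores N, gaps, etc.).
        total = sum(counts[b] for b in alphabet)
        if total == 0:
            profile.append(0.0)  # no usable bases at this position
            continue
        entropy = 0.0
        for b in alphabet:
            if counts[b]:
                p = counts[b] / total
                entropy -= p * math.log2(p)
        profile.append(max_bits - entropy)
    return profile
```

Applied separately to windows on the exon side and the intron side of a junction, a profile like this gives the kind of per-region totals that can be compared between groups of introns.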

12.

Background

In humans, much of the information specifying splice sites is not at the splice site. Exonic splice enhancers are one of the principal non-splice site motifs. Four high-throughput studies have provided a compendium of motifs that function as exonic splice enhancers, but only one, RESCUE-ESE, has been generally employed to examine the properties of enhancers. Here we consider these four datasets to ask whether there is any consensus on the properties and impacts of exonic splice enhancers.

Results

While only about 1% of all the identified hexamer motifs are common to all analyses we can define reasonably sized sets that are found in most datasets. These consensus intersection datasets we presume reflect the true properties of exonic splice enhancers. Given prior evidence for the properties of enhancers and splice-associated mutations, we ask for all datasets whether the exonic splice enhancers considered are purine enriched; enriched near exon boundaries; able to predict trends in relative codon usage; slow evolving at synonymous sites; rare in SNPs; associated with weak splice sites; and enriched near longer introns. While the intersect datasets match expectations, only one original dataset, RESCUE-ESE, does. Unexpectedly, a fully experimental dataset identifies motifs that commonly behave opposite to the consensus, for example, being enriched in exon cores where splice-associated mutations are rare.

Conclusions

Prior analyses that used the RESCUE-ESE set of hexamers captured the properties of consensus exonic splice enhancers. We estimate that at least 4% of synonymous mutations are deleterious owing to an effect on enhancer functioning.
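
The idea of "consensus intersection datasets" in the abstract above (motifs found in most, though not necessarily all, source datasets) can be sketched as a simple vote count over sets. This is an illustrative assumption about how such sets might be built, not the authors' method: the function name `consensus_motifs`, the `min_datasets` threshold, and the toy hexamer strings in the usage example are hypothetical.

```python
from collections import Counter


def consensus_motifs(datasets, min_datasets):
    """Return motifs that appear in at least `min_datasets` of the given datasets.

    `datasets` maps a dataset label to an iterable of motif strings.
    Duplicates within one dataset count once.
    """
    votes = Counter()
    for motifs in datasets.values():
        votes.update(set(motifs))
    return {motif for motif, n in votes.items() if n >= min_datasets}


# Toy usage with placeholder hexamers (not real enhancer motifs).
toy = {
    "set_a": ["GAAGAA", "AAGAAG", "TTTTTT"],
    "set_b": ["GAAGAA", "AAGAAG"],
    "set_c": ["GAAGAA", "CCCCCC"],
    "set_d": ["GAAGAA", "AAGAAG", "CCCCCC"],
}
in_all = consensus_motifs(toy, min_datasets=4)
in_most = consensus_motifs(toy, min_datasets=3)
```

Requiring all four datasets gives the strict intersection; lowering the threshold to three yields the larger "found in most datasets" set.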

13.
14.
The present century has witnessed an unprecedented rise in genome sequences owing to various genome-sequencing programs. However, the same has not been replicated with cDNA or expressed sequence tags (ESTs). Hence, prediction of the protein coding sequence of genes from this enormous collection of genomic sequences presents a significant challenge. While robust high throughput methods of cloning and expression could be used to meet protein requirements, lack of intron information creates a bottleneck. Computational programs designed for recognizing intron–exon boundaries for a particular organism or group of organisms have their own limitations. Keeping this in view, we describe here a method for construction of an intron-less gene from genomic DNA in the absence of cDNA/EST information and an organism-specific gene prediction program. The method outlined is a sequential application of bioinformatics to predict correct intron–exon boundaries and splicing by overlap extension PCR for spliced gene synthesis. The gene construct so obtained can then be cloned for protein expression. The method is simple and can be used for any eukaryotic gene expression.

15.
The genes encoding mouse and human acetylcholinesterases have been cloned from genomic and cosmid libraries. Restriction analysis and a comparison of sequence with the cDNAs have defined the exon-intron boundaries. In mammals, three invariant exons encode the signal peptide and the amino-terminal 535 amino acids common to all forms of the enzyme whereas alternative exon usage of the next exon accounts for the structural divergence in the carboxyl termini of the catalytic subunits. mRNA protection studies show that the cDNA encoding the hydrophilic catalytic subunits represents the dominant mRNA species in mammalian brain and muscle whereas divergent mRNA species are evident in cells of hematopoietic origin (bone marrow cells and an erythroleukemia cell line). Analyses of mRNA species in these cells and the genomic sequence have enabled us to define two alternative exons in addition to the one found in the cDNAs; they encode unique carboxyl-terminal sequences. One mRNA consists of a direct extension through the intervening sequence between the common exon and the 3' exon deduced from the cDNA. This sequence encodes a subunit lacking the cysteine critical to oligomer formation. Another mRNA results from a splice that encodes a stretch of hydrophobic amino acids immediately upstream of a stop codon. This exon, when spliced to the upstream invariant exons, should encode glycophospholipid-linked species of the enzyme. Homologous sequence, identity of exon-intron junctions, and identity of position of the stop codon are seen for this region in mouse and human. Polymerase chain reactions carried out across the expected intron region and mRNA protection studies show that this splice occurs in mouse bone marrow and erythroleukemia cells yielding the appropriate cDNA.

16.
The Su(var)205 gene of Drosophila melanogaster encodes heterochromatin protein 1 (HP1), a protein located preferentially within beta-heterochromatin. Mutation of this gene has been associated with dominant suppression of position-effect variegation. We have cloned and sequenced the gene encoding HP1 from Drosophila virilis, a distantly related species. Comparison of the predicted amino acid sequence with Drosophila melanogaster HP1 shows two regions of strong homology, one near the N-terminus (57/61 amino acids identical) and the other near the C-terminus (62/68 amino acids identical) of the protein. Little homology is seen in the 5' and 3' untranslated portions of the gene, as well as in the intronic sequences, although intron/exon boundaries are generally conserved. A comparison of the deduced amino acid sequences of HP1-like proteins from other species shows that the cores of the N-terminal and C-terminal domains have been conserved from insects to mammals. The high degree of conservation suggests that these N- and C-terminal domains could interact with other macromolecules in the formation of the condensed structure of heterochromatin.

17.
Introns are flanked by a partially conserved coding sequence that forms the immediate exon junction sequence following intron removal from pre-mRNA. Phylogenetic evidence indicates that these sequences have been targeted by numerous intron insertions during evolution, but little is known about this process. Here, we test the prediction that exon junction sequences were functional splice sites that existed in the coding sequence of genes prior to the insertion of introns. To do this, we experimentally identified nine cryptic splice sites within the coding sequence of actin genes from humans, Arabidopsis, and Physarum by inactivating their normal intron splice sites. We found that seven of these cryptic splice sites correspond exactly to the positions of exon junctions in actin genes from other species. Because actin genes are highly conserved, we could conclude that at least seven actin introns are flanked by cryptic splice sites, and from the phylogenetic evidence, we could also conclude that actin introns were inserted into these cryptic splice sites during evolution. Furthermore, our results indicate that these insertion events were dependent upon the splicing machinery. Because most introns are flanked by similar sequences, our results are likely to be of general relevance.

18.
Isovaleric acidemia (IVA) is a recessive disorder caused by a deficiency of isovaleryl-CoA dehydrogenase (IVD). We have reported elsewhere nine point mutations in the IVD gene in fibroblasts of patients with IVA, which lead to abnormalities in IVD protein processing and activity. In this report, we describe eight IVD gene mutations identified in seven IVA patients that result in abnormal splicing of IVD RNA. Four mutations in the coding region lead to aberrantly spliced mRNA species in patient fibroblasts. Three of these are amino acid altering point mutations, whereas one is a single-base insertion that leads to a shift in the reading frame of the mRNA. Two of the coding mutations strengthen pre-existing cryptic splice acceptors adjacent to the natural splice junctions and apparently interfere with exon recognition, resulting in exon skipping. This mechanism for missplicing has not been reported elsewhere. Four other mutations alter either the conserved gt or ag dinucleotide splice sites in the IVD gene. Exon skipping and cryptic splicing were confirmed by transfection of these mutations into a Cos-7 cell line model splicing system. Several of the mutations were predicted by individual information analysis to inactivate or significantly weaken adjacent donor or acceptor sites. The high frequency of splicing mutations identified in these patients is unusual, as is the finding of missplicing associated with missense mutations in exons. These results may lead to a better understanding of the phenotypic complexity of IVA, as well as provide insight into those factors important in defining intron/exon boundaries in vivo.

19.
20.