Similar literature
20 similar documents found.
1.
A new method which predicts internal exon sequences in human DNA has been developed. The method is based on a splice site prediction algorithm that uses the linear discriminant function to combine information about significant triplet frequencies of various functional parts of splice site regions and preferences of oligonucleotides in protein coding and intron regions. The accuracy of our splice site recognition function is 97% for donor splice sites and 96% for acceptor splice sites. For exon prediction, we combine in a discriminant function the characteristics describing the 5'-intron region, donor splice site, coding region, acceptor splice site and 3'-intron region for each open reading frame flanked by GT and AG base pairs. The accuracy of precise internal exon recognition on a test set of 451 exon and 246693 pseudoexon sequences is 77% with a specificity of 79%. The recognition quality computed at the level of individual nucleotides is 89% for exon sequences and 98% for intron sequences. This corresponds to a correlation coefficient for exon prediction of 0.87. The precision of this approach is better than that of other methods and has been tested on a larger data set. We have also developed a means for predicting exon-exon junctions in cDNA sequences, which can be useful for selecting optimal PCR primers.
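
The candidate-generation step mentioned in this abstract (open reading frames flanked by AG and GT dinucleotides) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the length limits, and the choice to accept a segment when any of its three frames lacks a stop codon are assumptions made for illustration, and the discriminant scoring itself is not shown.

```python
# Hypothetical sketch: enumerate candidate internal exons, i.e. segments that
# are preceded by an acceptor "AG", followed by a donor "GT", and that contain
# at least one reading frame without a stop codon.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_open_frame(segment):
    """Return True if at least one of the three frames has no stop codon."""
    for frame in range(3):
        codons = (segment[k:k + 3] for k in range(frame, len(segment) - 2, 3))
        if not any(codon in STOP_CODONS for codon in codons):
            return True
    return False


def candidate_internal_exons(seq, min_len=30, max_len=300):
    """Return (start, end) pairs with seq[start-2:start] == 'AG' and
    seq[end:end+2] == 'GT'. min_len and max_len are assumed limits."""
    seq = seq.upper()
    # An exon starts right after an acceptor AG and ends right before a donor GT.
    starts = [i + 2 for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    ends = [j for j in range(len(seq) - 1) if seq[j:j + 2] == "GT"]
    candidates = []
    for start in starts:
        for end in ends:
            if min_len <= end - start <= max_len and has_open_frame(seq[start:end]):
                candidates.append((start, end))
    return candidates
```

In a real pipeline each candidate would then be scored (for example by a discriminant function over splice-site and coding features, as the abstract describes) to separate true exons from pseudoexons.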

2.
Introns are flanked by a partially conserved coding sequence that forms the immediate exon junction sequence following intron removal from pre-mRNA. Phylogenetic evidence indicates that these sequences have been targeted by numerous intron insertions during evolution, but little is known about this process. Here, we test the prediction that exon junction sequences were functional splice sites that existed in the coding sequence of genes prior to the insertion of introns. To do this, we experimentally identified nine cryptic splice sites within the coding sequence of actin genes from humans, Arabidopsis, and Physarum by inactivating their normal intron splice sites. We found that seven of these cryptic splice sites correspond exactly to the positions of exon junctions in actin genes from other species. Because actin genes are highly conserved, we could conclude that at least seven actin introns are flanked by cryptic splice sites, and from the phylogenetic evidence, we could also conclude that actin introns were inserted into these cryptic splice sites during evolution. Furthermore, our results indicate that these insertion events were dependent upon the splicing machinery. Because most introns are flanked by similar sequences, our results are likely to be of general relevance.

3.
A new analytical method has been used to examine the set of 40 exon/intron boundaries within the rat embryonic myosin heavy chain (MHCemb) gene. It has also been applied to an additional set of 850 splice sequences selected from GenBank. Strong evidence is obtained for the involvement of 3' ends but not 5' ends of exon sequences in splice site recognition. It can be determined that signal sequences of 5' intron ends concentrate near the splice borders, while the distributions of the 3' intron ends have a diffuse character. The possibility of re-interpreting some known features, in terms of the absence of certain elements rather than the presence of elements forming sequence determinants, is discussed. The analysis undertaken enabled us to work out a more detailed set of recognition sequence requirements for the splicing of nuclear pre-mRNA. In addition to requirements which have already been established we suggest the following: the ‘AG-absence’ in the immediate 3' terminal intron sequences; and a minimal match between a particular sequence and the known exon/intron consensus sequence of 5' splice junctions. Received on March 22, 1988; accepted on November 19, 1988

4.
A comparison of the nucleotide sequences around the splice junctions that flank old (shared by two or more major lineages of eukaryotes) and new (lineage-specific) introns in eukaryotic genes reveals substantial differences in the distribution of information between introns and exons. Old introns have a lower information content in the exon regions adjacent to the splice sites than new introns but have a corresponding higher information content in the intron itself. This suggests that introns insert into nonrandom (proto-splice) sites but, during the evolution of an intron after insertion, the splice signal shifts from the flanking exon regions to the ends of the intron itself. Accumulation of information inside the intron during evolution suggests that new introns largely emerge de novo rather than through propagation and migration of old introns.
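
The "information content" compared in this abstract is commonly computed per alignment position as 2 bits minus the Shannon entropy of the observed nucleotide frequencies. The sketch below shows that standard calculation only. The function name is hypothetical, and the code assumes equal-length sequences over A, C, G and T with no small-sample correction and a uniform background; whether the study used exactly this form is not stated in the abstract.

```python
import math
from collections import Counter


def information_content(aligned):
    """Per-position information content in bits (2 - Shannon entropy) for
    equal-length DNA sequences aligned on a splice junction."""
    length = len(aligned[0])
    profile = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in aligned)
        total = sum(counts.values())
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        profile.append(2.0 - entropy)
    return profile
```

Summing the profile separately over the exonic and intronic positions of a junction would give the kind of exon-side versus intron-side totals that the abstract contrasts for old and new introns.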

5.
6.
7.
8.
Cloning and characterization of the human beta-glucuronidase gene   Total citations: 2 (self-citations: 0, citations by others: 2)
We have isolated a cosmid clone that contains GUSB, the human gene encoding beta-glucuronidase. The 21-kb gene contains 12 exons ranging from 85 to 376 bp in length. Exon 6 corresponds to the 153-bp deletion in the shorter of two types of cDNAs reported earlier, supporting the hypothesis that this cDNA arose by alternate splicing leading to exon skipping. The insert contains 4.2 kb of sequence upstream from the first exon and 6 kb 3' of the last exon. The clone expresses human beta-glucuronidase in stably transformed rat XCtk- cells. Comparison of the human gene organization with that recently reported for the murine beta-glucuronidase gene revealed that the intron/exon boundaries are identical. In the splice junctions, the most highly conserved regions are those identified as consensus sequences, and these are at least as highly conserved as bases encoding the translated portion of the gene.

9.
The primary structure of the human glutathione reductase gene (GSR) was determined by genomic cloning. The human GSR gene spans 50 kb, consists of 13 exons, and was found to be highly similar to the mouse GSR gene. The coding sequence of human GSR resides on all 13 exons. An N-terminal arginine-rich mitochondrial leader sequence was present, with high homology to the murine leader sequence, between two in-frame start codons in the first exon. The 5' and 3' intron/exon splice junctions, with one exception, followed the general consensus sequences for intron splice donor and acceptor sites.

10.
Prediction of splice sites in non-coding regions of genes is one of the most challenging aspects of gene structure recognition. We perform a rigorous analysis of such splice sites embedded in human 5' untranslated regions (UTRs), and investigate correlations between this class of splice sites and other features found in the adjacent exons and introns. By restricting the training of neural network algorithms to 'pure' UTRs (not extending partially into protein coding regions), we for the first time investigate the predictive power of the splicing signal proper, in contrast to conventional splice site prediction, which typically relies on the change in sequence at the transition from protein coding to non-coding. By doing so, the algorithms were able to pick up subtler splicing signals that were otherwise masked by 'coding' noise, thus enhancing significantly the prediction of 5' UTR splice sites. For example, the non-coding splice site predicting networks pick up compositional and positional bias in the 3' ends of non-coding exons and 5' non-coding intron ends, where cytosine and guanine are over-represented. This compositional bias at the true UTR donor sites is also visible in the synaptic weights of the neural networks trained to identify UTR donor sites. Conventional splice site prediction methods perform poorly in UTRs because the reading frame pattern is absent. The NetUTR method presented here performs 2-3-fold better compared with NetGene2 and GenScan in 5' UTRs. We also tested the 5' UTR trained method on protein coding regions, and discovered, surprisingly, that it works quite well (although it cannot compete with NetGene2). This indicates that the local splicing pattern in UTRs and coding regions is largely the same. The NetUTR method is made publicly available at www.cbs.dtu.dk/services/NetUTR.

11.
12.
Splicing and the evolution of proteins in mammals   Total citations: 3 (self-citations: 0, citations by others: 3)
It is often supposed that a protein's rate of evolution and its amino acid content are determined by the function and anatomy of the protein. Here we examine an alternative possibility, namely that the requirement to specify in the unprocessed RNA, in the vicinity of intron–exon boundaries, information necessary for removal of introns (e.g., exonic splice enhancers) affects both amino acid usage and rates of protein evolution. We find that the majority of amino acids show skewed usage near intron–exon boundaries, and that differences in the trends for the 2-fold and 4-fold blocks of both arginine and leucine show this to be owing to effects mediated at the nucleotide level. More specifically, there is a robust relationship between the extent to which an amino acid is preferred/avoided near boundaries and its enrichment/paucity in splice enhancers. As might then be expected, the rate of evolution is lowest near intron–exon boundaries, at least in part owing to splice enhancers, such that domains flanking intron–exon junctions evolve on average at under half the rate of exon centres from the same gene. In contrast, the rate of evolution of intronless retrogenes is highest near the domains where intron–exon junctions previously resided. The proportion of sequence near intron–exon boundaries is one of the stronger predictors of a protein's rate of evolution in mammals yet described. We conclude that after intron insertion selection favours modification of amino acid content near intron–exon junctions, so as to enable efficient intron removal, these changes then being subject to strong purifying selection even if nonoptimal for protein function. Thus there exists a strong force operating on protein evolution in mammals that is not explained directly in terms of the biology of the protein.

13.
Disease causing aberrations in both tuberous sclerosis predisposing genes, TSC1 and TSC2, comprise nearly every type of alteration with a predominance of small truncating mutations distributed over both genes. We performed an RNA based screening of the entire coding regions of both TSC genes applying the protein truncation test (PTT) and identified a high proportion of unusual splicing abnormalities affecting the TSC2 gene. Two cases exhibited different splice acceptor mutations in intron 9 (IVS9-15G-->A and IVS9-3C-->G) both accompanied by exon 10 skipping and simultaneous usage of a cryptic splice acceptor in exon 10. Another splice acceptor mutation (IVS38-18A-->G) destroyed the putative polypyrimidine structure in intron 38 and resulted in simultaneous intron retention and usage of a downstream cryptic splice acceptor in exon 39. Another patient bore a C-->T transition in intron 8 (IVS8+281C-->T) activating a splice donor site and resulting in the inclusion of a newly recognised exon in the mRNA followed by a premature stop. These splice variants deduced from experimental results are additionally supported by RNA secondary structure analysis based on free energy minimisation. Three of the reported splicing anomalies are due to sequence changes remote from exon/intron boundaries, described for the first time in TSC. These findings highlight the significance of investigating intronic changes and their consequences on the mRNA level as disease causing mutations in TSC.

14.
MOTIVATION: Alternative splicing is currently seen to explain the vast disparity between the number of predicted genes in the human genome and the highly diverse proteome. The mapping of expressed sequence tag (EST) consensus sequences derived from the GeneNest database onto the genome provides an efficient way of predicting exon-intron boundaries, gene structure and alternative splicing events. However, the alternative splicing events are obscured by a large number of putatively artificial exon boundaries arising due to genomic contamination or alignment errors. The current work describes a methodology to associate quality values with the predicted exon-intron boundaries. High quality exon-intron boundaries are used to predict constitutive and alternative splicing ranked by confidence values, aiming to facilitate large-scale analysis of alternative splicing and splicing in general. RESULTS: Applying the current methodology, constitutive splicing is observed in 33,270 EST clusters, out of which 45% are alternatively spliced. The classification derived from the computed confidence values for 17 of these splice events correlates in most cases (15/17) with RT-PCR experiments performed for 40 different tissue samples. As an application of the confidence measure, an evaluation of the distribution of alternative splicing revealed that the majority of variants correspond to the coding regions of the genes. However, a significant fraction still maps to non-coding regions, thereby indicating a functional relevance of alternative splicing in untranslated regions. AVAILABILITY: The predicted alternative splice variants are visualized in the SpliceNest database at http://splicenest.molgen.mpg.de

15.
16.
Piva F  Principato G 《Gene》2007,393(1-2):81-86
There is ample evidence that prediction of human splice sites can be refined by analyzing the nucleotides surrounding splice sites. This could mean that exon nucleotides over splice sites harbour information for the splicing process in addition to the coding information to specify amino acids. We analyzed the correlations among the nucleotides lying at the end and at the beginning of all the consecutive human exons to seek relationships among the nucleotides. We have divided the sequences taking into account the phase of interruption. Even though exon sequences are involved in the coding function, we found phase-dependent, specific correlations in the area of exon junctions. These regularities do not give rise to specific motifs, but rather to a phase-specific nucleotide context that could contribute to define the splice site or aid the splicing machinery to join the exon ends. Results provide further evidence that accurate selection of human splice sites likely requires the contribution of exon regulatory sequences.

17.
18.
The gene responsible for cystic fibrosis, the most common severe autosomal recessive disorder, is located on the long arm of human chromosome 7, region q31-q32. The gene has recently been identified and shown to be approximately 250 kb in size. To understand the structure and to provide the basis for a systematic analysis of the disease-causing mutations in the gene, genomic DNA clones spanning different regions of the previously reported cDNA were isolated and used to determine the coding regions and sequences of intron/exon boundaries. A total of 22,708 bp of sequence, accounting for approximately 10% of the entire gene, was obtained. Alignment of the genomic DNA sequence with the cDNA sequence showed perfect colinearity between the two and a total of 27 exons, each flanked by consensus splice signals. A number of repetitive elements, including the Alu and Kpn families and simple repeats, such as (GT)17, (GATT)7, and (TA)14, were detected in close vicinity of some of the intron/exon boundaries. At least three of the simple repeats were found to be polymorphic in the population. Although an internal amino acid sequence homology could be detected between the two halves of the predicted polypeptide, especially in the regions of the two putative nucleotide-binding folds (NBF1 and NBF2), the lack of alignment of the nucleotide sequence as well as the different positions of the exon/intron boundaries does not seem to support the hypothesis of a recent gene duplication event. To facilitate detection of mutations by direct sequence analysis of genomic DNA, 28 sets of oligonucleotide primers were designed and tested for their ability to amplify individual exons and the immediately flanking sequences in the introns.

19.
Conserved quartets near 5' intron junctions in primate nuclear pre-mRNA   Total citations: 2 (self-citations: 0, citations by others: 2)
Analysis of a 1000 nucleotide span around 664 primate 5' exon/intron junctions revealed frequent recurrences of G-rich runs downstream of the 5' splice sites. In particular, AGGG, GGGA, GGGG, GGGT and TGGG are frequent at this site. Some C-rich quartets are frequent upstream of the 5' splice site. Similar behaviour of these G- and C-rich quartets is indicated for the 587 rodent introns and for a combined eukaryotic file containing 1688 introns. (A)GGG(A) is also frequent in the introns 60 nucleotides upstream of the 3' splice site, and (A)CCC(A) is frequently found in the exons downstream of the 3' site. The same consistent behaviour of the 3' splice sites is obtained as for the 5' sites, for the primates, rodents and combined eukaryotic file. These results suggest that in addition to the well-conserved 5' and 3' splice sequences, exon as well as intron sequences may play a role in nuclear pre-mRNA splicing.
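
The kind of quartet tally this abstract reports can be sketched as a count of overlapping 4-mers in a window downstream of a 5' splice site. The function name, the 0-based `site` convention (index of the first intron base) and the default window length are assumptions for illustration; the abstract does not describe the authors' counting procedure in this detail.

```python
from collections import Counter


def quartet_counts(seq, site, window=60):
    """Count overlapping 4-mers in the `window` bases starting at `site`,
    where `site` is the 0-based index of the first intron base."""
    region = seq.upper()[site:site + window]
    return Counter(region[i:i + 4] for i in range(len(region) - 3))
```

Pooling these counts across many junctions and comparing them with counts from windows elsewhere in the same sequences would show whether quartets such as GGGG or AGGG are over-represented near the splice site.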

20.
The oncogene (v-myb) of avian myeloblastosis virus apparently arose by transduction of nucleotide sequences from a cellular gene (c-myb). In c-myb the nucleotide sequences that formed v-myb exist at seven distinct regions separated by nontransduced stretches of sequence that are flanked by eucaryotic splice signals. By contrast, the sequences at the outside boundaries of the transduced region of c-myb do not resemble splice sites. We mapped the nucleotide sequences that are homologous to the ends of v-myb with respect to the exons and introns of c-myb. The results indicate that the leftward recombination between c-myb and the transducing retrovirus occurred within an intron of the cellular gene, whereas the rightward recombination took place in an exon of c-myb. Transduction of c-myb sequences may therefore have involved a DNA rearrangement.
