Similar articles
1.
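Predicting complete gene structures is an important foundational problem in current life-science research, and a key step is the accurate identification of splice sites and of the various alternative splicing events. Identifying splice sites and alternative splicing events from transcriptome sequencing (RNA-seq) data is a strategy that has emerged in recent years with the development of next-generation sequencing. In this work, based on RNA-seq data from Drosophila melanogaster testis, TopHat was used to identify 39,718 Drosophila splice sites, of which 10,584 are novel. In addition, based on different combinations of splice sites, computational algorithms were developed for the features of each type of alternative splicing, and 8,477 alternative splicing events were identified (3,922 of them newly identified), covering four types: alternative donor sites, alternative acceptor sites, intron retention, and exon skipping. RT-PCR experiments validated newly identified alternative splicing events in two Drosophila genes and revealed entirely new splice isoforms. These results further show that RNA-seq data can be used effectively to identify splice sites and alternative splicing events, providing new ideas and tools for elucidating splicing mechanisms and the biological functions of alternative splicing.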
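
The idea of deriving events from combinations of splice sites can be sketched roughly as follows. This is not the authors' algorithm: the function name `classify_junction_events`, the plus-strand `(donor, acceptor)` coordinate convention, and the rule that excludes exon-skipping junctions from the donor/acceptor comparison are all assumptions made for illustration. Intron retention is left out because it cannot be inferred from junctions alone; it also needs read coverage inside the intron.

```python
from collections import defaultdict


def classify_junction_events(junctions):
    """Classify simple alternative splicing patterns from intron junctions.

    junctions: iterable of (donor, acceptor) integer pairs with donor < acceptor,
    all from one gene on the plus strand (an assumption of this sketch).
    Returns a sorted list of event tuples.
    """
    juncs = sorted(set(junctions))
    by_donor = defaultdict(set)      # donor -> acceptors paired with it
    by_acceptor = defaultdict(set)   # acceptor -> donors paired with it
    for donor, acceptor in juncs:
        by_donor[donor].add(acceptor)
        by_acceptor[acceptor].add(donor)

    events = []
    skipping = set()
    # Exon skipping: a long junction (d, a) spans two shorter junctions
    # (d, a1) and (d2, a) that flank an internal exon [a1, d2].
    for d, a in juncs:
        for a1 in by_donor[d]:
            for d2 in by_acceptor[a]:
                if d < a1 < d2 < a:
                    events.append(("exon_skipping", (d, a), (a1, d2)))
                    skipping.add((d, a))

    # Alternative acceptor: one donor paired with several acceptors,
    # ignoring junctions already explained by exon skipping.
    for donor, accs in by_donor.items():
        kept = sorted(acc for acc in accs if (donor, acc) not in skipping)
        if len(kept) > 1:
            events.append(("alternative_acceptor", donor, tuple(kept)))

    # Alternative donor: one acceptor paired with several donors.
    for acceptor, dons in by_acceptor.items():
        kept = sorted(don for don in dons if (don, acceptor) not in skipping)
        if len(kept) > 1:
            events.append(("alternative_donor", acceptor, tuple(kept)))

    return sorted(events)


if __name__ == "__main__":
    example = [(100, 200), (100, 210), (300, 400), (310, 400),
               (500, 600), (650, 800), (500, 800)]
    for event in classify_junction_events(example):
        print(event)
```

In practice the junction list would come from an aligner's output (for example, the junction file written by TopHat), and minus-strand genes would need the donor and acceptor roles swapped.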

2.
Jin HY  Luo LF  Zhang LR 《Gene》2008,424(1-2):115-120
A crucial part of gene structure prediction is the accurate identification of splice sites, both constitutive and alternative. Here, we use the maximum information principle (MIP) to analyze the conserved segments around splice sites. From the MIP, a reaction free energy (RFE) expression is deduced, which can be used to estimate the free energy change during a splicing reaction involving a donor or acceptor site. The expression contains not only the background probability factors but also all kinds of dependencies among both adjacent and non-adjacent bases. We apply the RFE expression to recognize splice sites and their flanking competitors in human genes. The results show high sensitivity and specificity, so the RFE expression accords well with the splicing reaction process. Moreover, the RFE expression is better than previous methods at predicting competitors of splice sites, and it outperforms the reaction free energy subtraction (RFES), which implies that RFE competition between a given splice site and its flanking competitor may not be the only primary factor in alternative splice site selection. The work is helpful not only for understanding the splicing reaction through its relation to the MIP, but also for research on the computational recognition of splice sites and alternative splice events.

3.
4.
Xia H  Bi J  Li Y 《Nucleic acids research》2006,34(21):6305-6313
Alternative splicing plays an important role in regulating gene expression. Currently, the most efficient methods use expressed sequence tags or microarray analysis for large-scale detection of alternative splicing. However, it is difficult to detect all alternative splice events with them because of their inherent limitations. Previous computational methods for alternative splicing prediction could only predict particular kinds of alternative splice events. Thus, it would be highly desirable to predict alternative 5'/3' splice sites with various splicing levels using genomic sequences alone. Here, we introduce the competition mechanism of splice site selection into alternative splice site prediction. This approach allows us to predict not only rarely used but also frequently used alternative splice sites. On a dataset extracted from the AltSplice database, our method correctly classified approximately 70% of the splice sites into alternative and constitutive, as well as approximately 80% of the locations of real competitors for alternative splice sites. It outperforms a method which only considers features extracted from the splice sites themselves. Furthermore, this approach can also predict the changes in activation level arising from mutations in flanking cryptic splice sites of a given splice site. Our approach might be useful for studying alternative splicing in both computational and molecular biology.

5.
Characterization and prediction of alternative splice sites   Total citations: 9, self-citations: 0, citations by others: 9
Wang M  Marín A 《Gene》2006,366(2):219-227
Human alternative isoform, cryptic, skipped, and constitutive splice sites from the ALTEXTRON database were analysed regarding splice site strength, composition, GC content, and the position and binding site strength of the polypyrimidine tract and branch site. Several features were identified which distinguish alternative isoform and cryptic splice sites, but not skipped splice sites, from constitutive ones. These include splice site strength, intron GC content, U2AF35 binding site score, and oligonucleotide frequencies. For the predictive classification of splice sites, pattern recognition models for different splicing factor binding sites and oligonucleotide frequency models (OFMs) were combined using backpropagation networks. 67.45% of acceptor sites and 71.23% of donor sites are correctly classified by networks trained for classification of constitutive and alternative isoform/cryptic splice sites. A web application for the prediction of alternative splice sites is available at http://es.embnet.org/~mwang/assp.html.
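
Two of the sequence features named above, GC content and oligonucleotide frequencies, are simple to compute. The sketch below is only an illustration under assumed conventions: the function names `gc_content` and `kmer_frequencies`, the overlapping-window counting, and the default `k=2` are not taken from the paper.

```python
from collections import Counter


def gc_content(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def kmer_frequencies(seq, k=2):
    """Relative frequency of each overlapping k-mer observed in the sequence."""
    seq = seq.upper()
    total = len(seq) - k + 1
    if total <= 0:
        return {}
    counts = Counter(seq[i:i + k] for i in range(total))
    return {kmer: n / total for kmer, n in counts.items()}


if __name__ == "__main__":
    intron_flank = "GTAAGTCCTTGG"  # made-up example sequence
    print(gc_content(intron_flank))
    print(kmer_frequencies(intron_flank, k=2))
```

Feature vectors of this kind would then be passed to a classifier; the backpropagation networks and binding-site models described in the abstract are not reproduced here.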

6.
7.
Many alternative splice events result in subtle mRNA changes, and most of them occur at short-distance tandem donor and acceptor sites. The splicing mechanism of such tandem sites likely involves the stochastic selection of either splice site. While tandem splice events are frequent, it is unknown how many are functionally important. Here, we use phylogenetic conservation to address this question, focusing on tandems with a distance of 3-9 nucleotides. We show that previous contradictory results on whether alternative or constitutive tandem motifs are more conserved between species can be explained by a statistical paradox (Simpson's paradox). Applying methods that take biases into account, we found higher conservation of alternative tandems in mouse, dog, and even chicken, zebrafish, and Fugu genomes. We estimated a lower bound for the number of alternative sites that are under purifying (negative) selection. While the absolute number of conserved tandem motifs decreases with the evolutionary distance, the fraction under selection increases. Interestingly, a number of frameshifting tandems are under selection, suggesting a role in regulating mRNA and protein levels via nonsense-mediated decay (NMD). An analysis of the intronic flanks shows that purifying selection also acts on the intronic sequence. We propose that stochastic splice site selection can be an advantageous mechanism that allows constant splice variant ratios in situations where a deviation in this ratio is deleterious.

8.
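High-accuracy, machine-learning-based splice site recognition is key to eukaryotic genome annotation. In this paper, a chi-square test is used to determine the sequence window length, a table of chi-square statistic differences is constructed to extract positional features, and these are combined with dinucleotide frequencies to represent each sequence. To handle the severe imbalance between positive and negative splice site samples, ten support vector machine classifiers are built, each on a balanced set of positive and negative samples, and their outputs are combined by weighted voting, which effectively addresses the imbalanced classification problem. Independent tests on the HS3D dataset show prediction accuracies of 93.39% for donor sites and 90.46% for acceptor sites, clearly higher than those of the reference methods. Positional features based on the chi-square statistic difference table represent DNA sequences effectively and show promise for recognizing signal sites in molecular sequences.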
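
A minimal sketch of the two ideas in this abstract is given below. It assumes a per-position chi-square statistic on a 2 x 4 table (class by base) and balanced training sets drawn by random undersampling of the negatives. The paper's exact "chi-square statistic difference table", its sampling scheme, and its voting weights are not given in the abstract, so the names `position_chi_square` and `balanced_subsets` and all details here are hypothetical. The SVM training and weighted voting steps are omitted.

```python
import random


def position_chi_square(positives, negatives, alphabet="ACGT"):
    """Chi-square statistic per position for base usage in positives vs negatives.

    Both inputs are lists of equal-length sequences aligned on the splice site.
    A high value marks a position whose base composition differs between classes.
    """
    length = len(positives[0])
    scores = []
    for pos in range(length):
        pos_counts = [sum(1 for s in positives if s[pos] == b) for b in alphabet]
        neg_counts = [sum(1 for s in negatives if s[pos] == b) for b in alphabet]
        n_pos, n_neg = sum(pos_counts), sum(neg_counts)
        total = n_pos + n_neg
        chi2 = 0.0
        for p, n in zip(pos_counts, neg_counts):
            column = p + n
            if column == 0:
                continue  # base never observed at this position
            for observed, row_total in ((p, n_pos), (n, n_neg)):
                expected = row_total * column / total
                if expected > 0:
                    chi2 += (observed - expected) ** 2 / expected
        scores.append(chi2)
    return scores


def balanced_subsets(positives, negatives, n_models=10, seed=0):
    """Build n_models training sets, each pairing all positives with an
    equally sized random sample of negatives (assumed undersampling scheme)."""
    if len(negatives) < len(positives):
        raise ValueError("this sketch assumes negatives outnumber positives")
    rng = random.Random(seed)
    return [(list(positives), rng.sample(list(negatives), len(positives)))
            for _ in range(n_models)]


if __name__ == "__main__":
    pos = ["AG", "AG"]
    neg = ["CG", "TG", "AA", "CC", "GT"]
    print(position_chi_square(pos, neg))
    print(len(balanced_subsets(pos, neg)))
```

Each balanced pair would train one classifier, and the ten predictions would then be combined by a weighted vote.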

9.
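Prediction of alternative splice sites for cassette exons and intron retention in the human genome   Total citations: 2, self-citations: 0, citations by others: 2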
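Alternative splicing of messenger RNA is one of the basic features distinguishing eukaryotes from prokaryotes. Alternative splicing of pre-mRNA greatly enriches protein diversity in higher eukaryotes and is closely related to tissue specificity. This article compiles statistics on some basic features of human cassette exons and retained introns. Using features such as the conservation of single bases, dinucleotides, and trinucleotides near splice sites, a quadratic discriminant method based on a diversity measure was applied to predict alternative splice sites at the donor and acceptor ends of cassette exons and retained introns. Cross-validation shows that recognition accuracy for the donor and acceptor ends of cassette exons exceeds 93% and 84%, respectively, and that for the donor and acceptor ends of retained introns exceeds 89% and 81%, respectively.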
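
One commonly used diversity measure in this line of work is the increment of diversity, defined from D(X) = N ln N - sum(n_i ln n_i) as ID(X, S) = D(X + S) - D(X) - D(S). The sketch below assumes that definition; the abstract does not state the exact formula used, and the function names are hypothetical. A small ID means the query's composition resembles the reference profile. The quadratic discriminant step that would consume these values is not shown.

```python
import math


def diversity(counts):
    """Diversity D = N ln N - sum(n_i ln n_i) of a collection of counts."""
    counts = list(counts)
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)


def increment_of_diversity(query, reference):
    """ID(X, S) = D(X + S) - D(X) - D(S) for two count dictionaries.

    query and reference map a feature (e.g. a dinucleotide) to its count.
    """
    keys = set(query) | set(reference)
    merged = [query.get(k, 0) + reference.get(k, 0) for k in keys]
    return diversity(merged) - diversity(query.values()) - diversity(reference.values())


if __name__ == "__main__":
    site = {"AG": 1, "GT": 1}      # made-up counts for one candidate site
    profile = {"AG": 2, "GT": 2}   # made-up counts for a training class
    print(increment_of_diversity(site, profile))
```

In a setup like the one described, one ID value per feature type (single bases, dinucleotides, trinucleotides) and per class would form the input vector for the discriminant function.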

10.
We present here a new algorithm for functional site analysis. It is based on four main assumptions: each variation of nucleotide composition makes a different contribution to the overall binding free energy of interaction between a functional site and another molecule; nonfunctioning site-like regions (pseudosites) are absent or rare in genomes; there may be errors in the sample of sites; and nucleotides of different site positions are considered to be mutually dependent. In this algorithm, the site set is divided into subsets, each described by a certain consensus. Donor splice sites of the human protein-coding genes were analyzed. Comparing the results with other methods of donor splice site prediction has demonstrated a more accurate prediction of consensus sequences AG/GU(A,G), G/GUnAG, /GU(A,G)AG, /GU(A,G)nGU, and G/GUA than is achieved by weight matrix and consensus (A,C)AG/GU(A,G)AGU with mismatches. The probability of the first type error, E1, for the obtained consensus set was about 0.05, and the probability of the second type error, E2, was 0.15. The analysis demonstrated that the accuracy of functional site prediction could be improved if one takes into account correlations between the site positions. The accuracy of prediction using human consensus sequences was tested on sequences from different organisms. Some differences in consensus sequences for the plant Arabidopsis sp., the invertebrate Caenorhabditis sp., and the fungus Aspergillus sp. were revealed. For the yeast Saccharomyces sp. only one conserved consensus, /GUA(U,A,C)G(U,A,C), was revealed (E1 = 0.03, E2 = 0.03). Yeast is a very interesting model to use for analysis of molecular mechanisms of splicing. Received: 14 October 1996 / Accepted: 30 January 1997
