Similar literature
1.
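Prediction of alternative splice sites for cassette exons and intron retention in the human genome   Total citations: 2, self-citations: 0, citations by others: 2
Alternative splicing of messenger RNA is one of the basic features that distinguish eukaryotes from prokaryotes. Alternative splicing of pre-mRNA greatly enriches protein diversity in higher eukaryotes and is closely related to tissue specificity. This article compiles statistics on some basic features of human cassette exons and intron retention. Based on features such as the conservation of single nucleotides, dinucleotides, and trinucleotides near the splice sites, a quadratic discriminant method based on the diversity measure was used to predict the donor and acceptor alternative splice sites of cassette exons and intron retention. Cross-validation shows that the recognition accuracy for cassette-exon donor and acceptor sites exceeds 93% and 84%, respectively, and that for intron-retention donor and acceptor sites exceeds 89% and 81%, respectively.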

2.
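Prediction of alternative and constitutive splice sites in the human genome   Total citations: 2, self-citations: 0, citations by others: 2
Based on the conserved nucleotide sequence features of splice sites, the base composition and correlation properties of adjacent positions, the distance between a pair of alternative splice sites, and the GC and TC content of the 30 bases preceding the acceptor splice site, a quadratic discriminant method combined with the diversity measure (IDQD) was used to predict the donor and acceptor splice sites of alternative and constitutive introns in the human genome. For alternative donor and acceptor splice sites, with the threshold ξ set to -2, the overall prediction accuracies were 87.9% and 89.9%, respectively. For constitutive donor and acceptor splice sites, with ξ set to -1, the overall prediction accuracies were 92.8% and 94.3%, respectively.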
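The abstract above does not spell out the diversity measure it relies on. As a minimal sketch, assuming the standard definitions D(X) = N log N - Σ n_i log n_i and ID(X, S) = D(X + S) - D(X) - D(S), the increment of diversity between two k-mer count profiles could be computed as below. The function names and the toy counts are illustrative assumptions, not taken from the paper.

```python
import math
from collections import Counter

def diversity(counts):
    """Diversity measure D = N*log(N) - sum(n_i*log(n_i)) over nonzero counts."""
    counts = list(counts)
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)

def kmer_counts(seq, k):
    """Count overlapping k-mers in a sequence (hypothetical helper)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def increment_of_diversity(x, s):
    """ID(X, S) = D(X + S) - D(X) - D(S) for two count dictionaries."""
    keys = set(x) | set(s)
    merged = [x.get(key, 0) + s.get(key, 0) for key in keys]
    return diversity(merged) - diversity(x.values()) - diversity(s.values())

# Toy usage: compare a query's trinucleotide profile with a class profile.
query = kmer_counts("AGGTAAGTAGGT", 3)
standard = kmer_counts("CAGGTGAGTCAGGTAAGT", 3)
id_value = increment_of_diversity(query, standard)
```

A smaller value means the two profiles are more similar. In an IDQD-style pipeline such values would presumably serve as input features to the quadratic discriminant step, which this sketch does not implement.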

3.
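Alternative splicing is an important mechanism for regulating gene expression, and identifying alternative splice sites is an important task in the post-genomic era. From the latest EBI database of alternative splicing in human genes, 5'/3' alternative splice sites were selected as the positive set and pseudo splice sites near the splice sites as the negative set, and all alternative and pseudo splice sites were randomly divided into a training set and a test set. The prediction method is a support vector machine based on position weight matrices and the increment of diversity. Using only the training set, with the single-base probabilities at different positions and the trinucleotide frequencies of sequence segments as information parameters, the position weight matrix and increment of diversity algorithms were combined with a support vector machine to obtain classifiers for alternative donor and acceptor sites, which were then used to predict the alternative donor and acceptor sites in the test set. On the independent test set, the prediction success rates for alternative donor and acceptor sites were 88.74% and 90.86%, with specificities of 85.62% and 81.19%, respectively. This success rate is higher than that of other alternative splice site prediction methods, and the method further improves the theoretical ability to predict alternative splice sites.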
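To illustrate the position weight matrix component mentioned in the abstract above, here is a minimal sketch that builds a log-odds matrix from aligned training sites and scores a candidate site. The uniform 0.25 background, the pseudocount, and the toy donor-like sequences are assumptions made for illustration. The increment of diversity features and the support vector machine itself are not covered here.

```python
import math

BASES = "ACGT"

def build_pwm(sequences, pseudocount=1.0):
    """Log-odds position weight matrix from equal-length aligned sites.

    Assumes a uniform background (0.25 per base) and additive pseudocounts.
    """
    length = len(sequences[0])
    pwm = []
    for pos in range(length):
        column = [seq[pos] for seq in sequences]
        total = len(column) + pseudocount * len(BASES)
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / 0.25)
            for base in BASES
        })
    return pwm

def score(pwm, seq):
    """Sum of per-position log-odds weights for a candidate site."""
    return sum(column[base] for column, base in zip(pwm, seq))

# Toy training sites (hypothetical, not from the EBI data set).
training_sites = ["AGGTAAGT", "AGGTGAGT", "CGGTAAGT", "AGGTAAGA"]
pwm = build_pwm(training_sites)
consensus_score = score(pwm, "AGGTAAGT")
background_score = score(pwm, "CCCCCCCC")
```

In a classifier like the one described, scores of this kind would be one group of features passed to the support vector machine alongside the trinucleotide-based features.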

4.
A clean data set of verified splice sites from Homo sapiens is reported, together with the standards used for the clean-up procedure. The sites were validated by: (i) standard cleaning procedures such as requiring consistency in the annotation of the gene structural elements, completeness of the coding regions and elimination of redundant sequences; (ii) clustering by decision trees coupled with analysis of ClustalW alignments of the translated protein sequence with homologous proteins from SWISS-PROT; (iii) matching against human EST sequences. The sites are categorised as: (i) donor sites, a set of 619 EST-confirmed donor sites, for 138 of which either the sites or the regions around the sites are involved in alternative splice events; (ii) acceptor sites, a set of 623 EST-confirmed acceptor sites, for 144 of which either the sites or the regions around the sites are involved in alternative splice events; (iii) genuine splice sites, a set of 392 splice sites wherein both the donor and acceptor sites had EST confirmation and were not involved in any alternative splicing; (iv) alternative splice sites, a set of 209 splice sites wherein both the donor and acceptor sites had EST confirmation and the sites or the regions around them were involved in alternative splicing. A set of nucleotide regions that can be used to generate a control set of false splice sites with a high confidence of being non-functional is also reported.

5.
Alternative splicing (AS) constitutes a major mechanism creating protein diversity in humans. Previous bioinformatics studies based on expressed sequence tag and mRNA data have identified many AS events that are conserved between humans and mice. Of these events, ~25% are related to alternative choices of 3′ and 5′ splice sites. Surprisingly, half of all these events involve 3′ splice sites that are exactly 3 nt apart. These tandem 3′ splice sites result from the presence of the NAGNAG motif at the acceptor splice site, recently reported to be widely spread in the human genome. Although the NAGNAG motif is common in human genes, only a small subset of sites with this motif is confirmed to be involved in AS. We examined the NAGNAG motifs and observed specific features such as high sequence conservation of the motif, high conservation of ~30 bp at the intronic regions flanking the 3′ splice site and overabundance of cis-regulatory elements, which are characteristic of alternatively spliced tandem acceptor sites and can distinguish them from the constitutive sites in which the proximal NAG splice site is selected. Our findings imply that AS at tandem splice sites and constitutive splicing of the distal NAG are highly regulated.

6.
7.
Wu Y, Zhang Y, Zhang J. Genomics, 2005, 86(3): 329-336
Ab initio prediction of functional exon splicing enhancer (ESE) elements based on RNA sequences presents a challenge in the evaluation of the functional impacts of human genetic polymorphisms on splicing. To better understand the behavior of ESEs, we studied their distribution in human exons and introns for four known SR protein-binding motifs: SF2/ASF, SC35, SRp40, and SRp55. ESEs are enriched in regions in exons that are close to the splice sites, especially in the region 80 to 120 bases away from the ends of splice acceptor sites. Significant enrichment of ESEs is associated with weak splice acceptor sites but not weak donor sites. ESE density decreases at the 3′ ends of long exons. ESEs are also enriched in introns with weak donor or acceptor sites. These characteristics of ESEs may help to predict functional ESE sites in RNA sequences.

8.
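Based on protein sequence composition information, a model that combines the increment of diversity with quadratic discriminant analysis (IDQD) is proposed for predicting protein-protein interactions and is applied to human protein-protein interactions. The recognition accuracy in the resubstitution (self-consistency) test reached 75.89%, and the sensitivity and specificity in 3-fold cross-validation were 64.22% and 64.68%, respectively. The results indicate that the IDQD algorithm can be used to predict protein-protein interactions.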

9.
10.
Prediction of human mRNA donor and acceptor sites from the DNA sequence   Total citations: 40, self-citations: 0, citations by others: 40
Artificial neural networks have been applied to the prediction of splice site location in human pre-mRNA. A joint prediction scheme where prediction of transition regions between introns and exons regulates a cutoff level for splice site assignment was able to predict splice site locations with confidence levels far better than previously reported in the literature. The problem of predicting donor and acceptor sites in human genes is hampered by the presence of large numbers of false positives: here, the distribution of these false splice sites is examined and linked to a possible scenario for the splicing mechanism in vivo. When the presented method detects 95% of the true donor and acceptor sites, it makes less than 0.1% false donor site assignments and less than 0.4% false acceptor site assignments. For the large data set used in this study, this means that on average there are one and a half false donor sites per true donor site and six false acceptor sites per true acceptor site. With the joint assignment method, more than a fifth of the true donor sites and around one fourth of the true acceptor sites could be detected without accompaniment of any false positive predictions. Highly confident splice sites could not be isolated with a widely used weight matrix method or by separate splice site networks. A complementary relation between the confidence levels of the coding/non-coding and the separate splice site networks was observed, with many weak splice sites having sharp transitions in the coding/non-coding signal and many stronger splice sites having more ill-defined transitions between coding and non-coding.
