Similar literature
19 similar articles found.
1.
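Prediction of alternative splice sites for cassette exons and intron retention in the human genome   Total citations: 2, self-citations: 0, citations by others: 2
Alternative splicing of messenger RNA is one of the basic features distinguishing eukaryotes from prokaryotes. Alternative splicing of pre-mRNA greatly enriches protein diversity in higher eukaryotes and is closely related to tissue specificity. This article compiles statistics on some basic features of human cassette exons and intron retention. Based on features such as the conservation of single nucleotides, dinucleotides, and trinucleotides near splice sites, a quadratic discriminant method based on a diversity measure was used to predict the donor-side and acceptor-side alternative splice sites of cassette exons and intron retention. Cross-validation shows that recognition accuracy for cassette-exon donor and acceptor sites exceeds 93% and 84%, respectively, and that for intron-retention donor and acceptor sites exceeds 89% and 81%, respectively.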
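The abstract does not give the formula for its diversity measure. As a rough illustration only, the sketch below assumes the commonly used form D(X) = N ln N − Σ nᵢ ln nᵢ over k-mer counts and the increment of diversity ID(X, Y) = D(X+Y) − D(X) − D(Y). The function names and the toy sequences are hypothetical, and the quadratic discriminant step is not shown.

```python
import math
from collections import Counter

def kmer_counts(seq, k):
    """Count overlapping k-mers (k=1, 2, 3 gives mono-, di-, trinucleotides)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def diversity(counts):
    """Assumed diversity measure: N*ln(N) - sum(n_i*ln(n_i))."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(
        n * math.log(n) for n in counts.values() if n > 0
    )

def increment_of_diversity(x, y):
    """ID(X, Y) = D(X+Y) - D(X) - D(Y); small values mean similar composition."""
    merged = Counter(x) + Counter(y)
    return diversity(merged) - diversity(x) - diversity(y)

# Hypothetical toy sequences, not data from the article.
reference = kmer_counts("GTAAGTGTAAGTGTGAGT", 2)
candidate = kmer_counts("GTAAGTGTAAGA", 2)
print(f"ID = {increment_of_diversity(reference, candidate):.3f}")
```

Under this assumption, a candidate site would be compared against the k-mer profiles of each class, and the resulting increments could serve as input features for a discriminant function.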

2.
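A Bayesian-network-based modeling approach is used to predict splice sites in eukaryotic DNA sequences. Separate models were built for donor and acceptor sites, and the network topology and the selection of upstream and downstream nodes were optimized according to the biological characteristics of the two site types. After the model parameters were obtained with the maximum-likelihood learning algorithm for Bayesian networks, splice sites in the test data were predicted using 10-fold cross-validation. The average prediction accuracy was 92.5% for acceptor sites, 94.0% for pseudo-acceptor sites, 92.3% for donor sites, and 93.5% for pseudo-donor sites, which overall is better than prediction methods based on independent and conditional probability matrices and on hidden Markov models. This indicates that modeling splice sites with Bayesian networks is an effective way to predict them.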

3.
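Prediction of complete gene structure is an important fundamental topic in current life science research, and a key step is the accurate identification of splice sites and of the various alternative splicing events. Identifying splice sites and alternative splicing events from transcriptome sequencing (RNA-seq) data is a strategy that has emerged in recent years with the development of next-generation sequencing. Using RNA-seq data from Drosophila melanogaster testis, this work identified 39,718 Drosophila splice sites with the TopHat software, of which 10,584 are novel. In addition, based on different combinations of splice sites, computational identification algorithms were developed for the features of each type of alternative splicing, identifying 8,477 alternative splicing events (3,922 of them newly identified) of four types: alternative donor site, alternative acceptor site, intron retention, and exon skipping. RT-PCR experiments validated newly identified alternative splicing events in 2 Drosophila genes and revealed entirely new splice isoforms. This further shows that RNA-seq data can be effectively applied to identifying splice sites and alternative splicing events, providing new ideas and tools for elucidating splicing mechanisms and the biological functions of alternative splicing.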
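The abstract says events were identified from combinations of splice sites but does not describe the algorithm. The sketch below is one simple way such a classification could work, not the authors' method. It assumes junctions are given as hypothetical (donor, acceptor) coordinate pairs on the plus strand of a single gene. Intron retention is omitted because it needs read coverage, not only junctions.

```python
from collections import defaultdict

def classify_events(junctions):
    """Group (donor, acceptor) junctions into simple alternative splicing events."""
    by_donor, by_acceptor = defaultdict(set), defaultdict(set)
    for donor, acceptor in junctions:
        by_donor[donor].add(acceptor)
        by_acceptor[acceptor].add(donor)

    events = {"alt_acceptor": [], "alt_donor": [], "exon_skipping": []}
    # One donor joined to several acceptors: alternative acceptor site.
    for donor, acceptors in by_donor.items():
        if len(acceptors) > 1:
            events["alt_acceptor"].append((donor, sorted(acceptors)))
    # One acceptor joined to several donors: alternative donor site.
    for acceptor, donors in by_acceptor.items():
        if len(donors) > 1:
            events["alt_donor"].append((acceptor, sorted(donors)))
    # A long junction (d1, a2) spanning two shorter ones (d1, a1) and (d2, a2)
    # with an exon a1..d2 in between: exon skipping.
    for d1, a2 in set(junctions):
        for a1 in by_donor[d1]:
            for d2 in by_acceptor[a2]:
                if d1 < a1 < d2 < a2:
                    events["exon_skipping"].append((d1, a1, d2, a2))
    for name in events:
        events[name].sort()
    return events

# Hypothetical junction coordinates, not data from the study.
junctions = [(100, 200), (250, 400), (100, 400), (500, 600), (500, 620)]
events = classify_events(junctions)
for name, found in events.items():
    print(name, found)
```

In this simplified version the junctions of an exon-skipping event are also reported as alternative donor and acceptor groups. A real pipeline would need rules to resolve such overlaps and to handle minus-strand genes.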

4.
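Identification of nine novel alternatively spliced mRNA variants of the RHD gene   Total citations: 1, self-citations: 0, citations by others: 1
Xu Xianguo, Wu Junjie, Hong Xiaozhen, Zhu Faming, Yan Lixing. 《遗传》 (Yi Chuan), 2006, 28(10): 1213-1218
To study the gene structure of the various alternatively spliced RHD mRNA variants, reverse transcription polymerase chain reaction (RT-PCR) was used to detect RHD mRNA in cord blood samples from normal individuals. RHD cDNA was TA-cloned and sequenced, the splice sites of each variant were analyzed by DNA sequencing, and RHD mRNA was subjected to expressed sequence tag (EST) analysis. Among 28 positive clones, in addition to full-length RHD cDNA, 12 RHD splice variants (including 9 novel ones) were detected, showing three splicing patterns: exon skipping and variation at the 5′ and 3′ splice sites, involving exons 2 to 9. Six of the novel variants also showed hybridization between the homologous RHD and RHCE genes. EST analysis additionally retrieved intron-retention variants. The study indicates that RHD mRNA is subject to a complex alternative splicing mechanism: beyond the previously reported variants, 9 novel RHD splice variants were detected, and alternative splicing was found to coexist with homologous hybridization.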

5.
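Prediction of constitutive and alternative splice sites using estimated reaction free energy   Total citations: 2, self-citations: 0, citations by others: 2
An important step in gene structure prediction is the accurate identification of splice sites. Based on the basic physical principles of the splicing reaction, the maximum information principle is applied to a theoretical analysis of the splicing reaction, from which an expression for the estimated reaction free energy is derived. As a simplified model, this expression can be used to estimate the free energy change in a splicing reaction involving a 5′ or 3′ splice region. It captures the correlations among bases fairly comprehensively and also accounts for the influence of genomic background probabilities. The expression was used to predict constitutive and alternative splice sites in human genes; the results are satisfactory, with predictive power comparable to some currently popular methods. This suggests that the maximum information principle can serve as a theoretical starting point for studying certain nucleic acid-protein interaction systems (such as the splicing reaction), and that the derived free energy expression fits the splicing process well.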

6.
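Predicting mRNA splice sites with neural networks   Total citations: 3, self-citations: 2, citations by others: 3
Neural networks were used to predict mRNA splice sites. Learning and prediction performance were compared under various conditions, and the effective prediction success rate, which reflects how well true splice sites are predicted, is discussed: it can reach 64%, while the overall prediction success rate can reach 98%. The correlation coefficient of the prediction is 0.66.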

7.
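Alternative splicing arises from the regulatory process by which multi-exon genes generate multiple transcripts. With advances in high-throughput sequencing, especially RNA-seq, spliced sequences and splice sites can be predicted by mining massive sequencing data. Alternative splicing has broadened our knowledge of gene structure and protein isoforms. However, existing short-read alignment software is affected by random alignments and produces many false-positive splice sites, which interfere with downstream analysis. This study finds that structural features of the sequences flanking alternative splice sites can be extracted by deep learning models, and it uses a deep convolutional neural network to identify splice sites. The model offers a high recognition rate, fast computation, strong generalization, and high robustness.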

8.
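This article briefly reviews the structure and function of the alternative splice variants of the receptor for advanced glycation end products (RAGE): full-length RAGE, N-terminally truncated RAGE, and C-terminally truncated RAGE (esRAGE). It explains that differences in the abundance and expression levels of these variants in endothelial cells lead to different biological effects. This helps us understand the molecular mechanism by which intracellular RAGE splice variants respond to advanced glycation end products (AGE), as well as individual differences in susceptibility to diabetic vascular complications.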

9.
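Alternative splicing is an important mechanism for generating proteome diversity and regulating gene expression. It has been studied extensively in higher eukaryotes but much less in unicellular eukaryotes; in ciliates, which are unicellular protozoa, there are only a few reports. Based on a large body of transcriptome data from the unicellular model protozoan Tetrahymena thermophila, this paper identifies and analyzes its alternatively spliced genes. A total of 2,894 alternative splice sites were identified in T. thermophila, involving 2,698 alternatively spliced genes that fall into four classes. Considering the accuracy of transcript assembly, 464 alternatively spliced genes fully consistent with the genome-predicted gene models were selected for in-depth analysis; of these, 49, 79, and 135 were specific to the growth, starvation, and conjugation stages, respectively. Functional analysis shows that alternatively spliced genes are involved in a wide range of functions and are significantly enriched in protein kinase processes, suggesting that they play an important role in protein phosphorylation and signal transduction in T. thermophila.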

10.
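Alternative splicing allows one gene to produce multiple mature mRNAs, greatly increasing protein diversity. Using the Actinidia chinensis (kiwifruit) genome as the reference and transcriptome data from leaves and from fruit at three developmental stages (immature, half-ripe, and ripe), 32,180 alternative splicing events corresponding to 11,651 genes (29% of all genes) were identified in the A. chinensis genome (39,040 genes). Among the types of alternative splicing, intron retention was the most frequent, accounting for more than 50%; the 3′ alternative site type was about twice as frequent as the 5′ alternative type. GO enrichment analysis shows that alternatively spliced genes are mainly enriched in GO categories related to enzyme regulation and nucleotide binding, while the enrichment hotspots of tissue-specific alternatively spliced genes are associated with important functions of each tissue: mostly actin- and microtubule-related in leaves, the two-component signaling system in immature fruit, phospholipid synthesis in half-ripe fruit, and signal transduction in ripe fruit. In addition, 55.6% of vitamin-synthesis-related genes undergo alternative splicing, significantly higher than the genome-wide level of 29.6%, suggesting an important role for alternative splicing in the metabolism of vitamin-synthesis-related genes. This genome-wide analysis of alternative splicing in A. chinensis provides a basis for interpreting its genome and for further molecular breeding work.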

11.
Characterization and prediction of alternative splice sites   Total citations: 9, self-citations: 0, citations by others: 9
Wang M, Marín A. Gene, 2006, 366(2): 219-227
Human alternative isoform, cryptic, skipped, and constitutive splice sites from the ALTEXTRON database were analysed regarding splice site strength, composition, GC content, and the position and binding site strength of the polypyrimidine tract and branch site. Several features were identified that distinguish alternative isoform and cryptic splice sites, but not skipped splice sites, from constitutive ones. These include splice site strength, intron GC content, U2AF35 binding site score, and oligonucleotide frequencies. For the predictive classification of splice sites, pattern recognition models for different splicing factor binding sites and oligonucleotide frequency models (OFMs) were combined using backpropagation networks. Networks trained to distinguish constitutive from alternative isoform/cryptic splice sites correctly classify 67.45% of acceptor sites and 71.23% of donor sites. A web application for the prediction of alternative splice sites is available at http://es.embnet.org/~mwang/assp.html .

12.
It has been previously observed that the intrinsically weak variant GC donor sites, in order to be recognized by the U2-type spliceosome, possess strong consensus sequences maximized for base pair formation with U1 and U5/U6 snRNAs. However, variability in signal strength is a fundamental mechanism for splice site selection in alternative splicing. Here we report human alternative GC-AG introns (for the first time from any species), and show that while constitutive GC-AG introns do possess strong signals at their donor sites, a large subset of alternative GC-AG introns possess weak consensus sequences at their donor sites. Surprisingly, this subset of alternative isoforms shows strong consensus at acceptor exon positions 1 and 2. The improved consensus at the acceptor exon can facilitate a strong interaction with U5 snRNA, which tethers the two exons for ligation during the second step of splicing. Further, these isoforms nearly always possess alternative acceptor sites and exhibit particularly weak polypyrimidine tracts characteristic of AG-dependent introns. The acceptor exon nucleotides are part of the consensus required for the U2AF35-mediated recognition of AG in such introns. Such improved consensus at acceptor exons is not found in either normal or alternative GT-AG introns having weak donor sites or weak polypyrimidine tracts. The changes probably reflect mechanisms that allow GC-AG alternative intron isoforms to cope with two conflicting requirements, namely an apparent need for differential splice strength to direct the choice of alternative sites and a need for improved donor signals to compensate for the central mismatch base pair (C-A) in the RNA duplex of U1 snRNA and the pre-mRNA. The other important findings include (i) one in every twenty alternative introns is a GC-AG intron, and (ii) three of every five observed GC-AG introns are alternative isoforms.

13.
The choice of a splice site depends not only on its own intrinsic strength but also on its flanking competitors. Splice site competition is an important mechanism for splice site prediction and, in particular, offers new insight for alternative splice site prediction. In this paper, a position weight matrix scoring function is used to represent splice site strength, and the mechanism of splice site competition is described by a single parameter: the scoring function subtraction. When applied to alternative splice site prediction, this single parameter correctly classifies 68.22% of donor sites and 70.86% of acceptor sites as alternative or constitutive. The prediction performance is approximately equal to that of a recent method based on the mechanism of splice site competition. The results reveal that the scoring function subtraction is the best parameter to describe the mechanism of splice site competition.
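The idea of scoring a site with a position weight matrix (PWM) and subtracting the score of a flanking competitor can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the log-odds form, the pseudocount, the uniform background of 0.25, and the toy training sequences are all hypothetical choices.

```python
import math
from collections import Counter

BASES = "ACGT"

def build_pwm(aligned_sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds PWM from equal-length aligned sites."""
    total = len(aligned_sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(len(aligned_sites[0])):
        counts = Counter(site[pos] for site in aligned_sites)
        pwm.append({
            base: math.log2(((counts[base] + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm

def pwm_score(pwm, seq):
    """Sum the per-position log-odds scores; a proxy for site strength."""
    return sum(column[base] for column, base in zip(pwm, seq))

def score_subtraction(pwm, site, competitor):
    """Score of a site minus the score of its flanking competitor."""
    return pwm_score(pwm, site) - pwm_score(pwm, competitor)

# Hypothetical donor-like 6-mers, not data from the paper.
training = ["GTAAGT", "GTGAGT", "GTAAGA", "GTAAGT"]
pwm = build_pwm(training)
strong, weak = "GTAAGT", "GTCCCC"
delta = score_subtraction(pwm, strong, weak)
print(f"score difference: {delta:.2f}")
```

A classifier along these lines would place a threshold on the score difference: a site that only narrowly outscores a nearby competitor is a plausible candidate for alternative use. How the threshold and the competitor window are chosen is not specified in the abstract.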

14.
A clean data set of verified splice sites from Homo sapiens is reported, as well as the standards used for the clean-up procedure. The sites were validated by: (i) standard cleaning procedures such as requiring consistency in the annotation of the gene structural elements, completeness of the coding regions and elimination of redundant sequences; (ii) clustering by decision trees coupled with analysis of ClustalW alignments of the translated protein sequence with homologous proteins from SWISS-PROT; (iii) matching against human EST sequences. The sites are categorised as: (i) donor sites, a set of 619 EST-confirmed donor sites, for 138 of which either the sites or the regions around them are involved in alternative splice events; (ii) acceptor sites, a set of 623 EST-confirmed acceptor sites, for 144 of which either the sites or the regions around them are involved in alternative splice events; (iii) genuine splice sites, a set of 392 splice sites wherein both the donor and acceptor sites had EST confirmation and were not involved in any alternative splicing; (iv) alternative splice sites, a set of 209 splice sites wherein both the donor and acceptor sites had EST confirmation and the sites or the regions around them were involved in alternative splicing. A set of nucleotide regions that can be used to generate a control set of false splice sites with a high confidence of being non-functional is also reported.

15.
16.
An information analysis of the 5' (donor) and 3' (acceptor) sequences spanning the ends of nearly 1800 human introns has provided evidence for structural features of splice sites that bear upon spliceosome evolution and function: (1) 82% of the sequence information (i.e. sequence conservation) at donor junctions and 97% of the sequence information at acceptor junctions is confined to the introns, allowing codon choices throughout exons to be largely unrestricted. The distribution of information at intron-exon junctions is also described in detail and compared with footprints. (2) Acceptor sites are found to possess enough information to be located in the transcribed portion of the human genome, whereas donor sites possess about one bit less than the information needed to locate them independently. This difference suggests that acceptor sites are located first in humans and, having been located, reduce by a factor of two the number of alternative sites available as donors. Direct experimental evidence exists to support this conclusion. (3) The sequences of donor and acceptor splice sites exhibit a striking similarity. This suggests that the two junctions derive from a common ancestor and that during evolution the information of both sites shifted onto the intron. If so, the protein and RNA components that are found in contemporary spliceosomes, and which are responsible for recognizing donor and acceptor sequences, should also be related. This conclusion is supported by the common structures found in different parts of the spliceosome.
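The per-position sequence information discussed above is usually measured in bits as 2 − H, where H is the Shannon entropy of the base frequencies at that position. The sketch below is a minimal version that assumes a uniform background over the four bases and omits the small-sample correction that a careful analysis would apply; the aligned sequences are hypothetical.

```python
import math
from collections import Counter

def position_information(aligned_sites):
    """Return information content in bits (2 - entropy) at each position."""
    n = len(aligned_sites)
    info = []
    for pos in range(len(aligned_sites[0])):
        counts = Counter(site[pos] for site in aligned_sites)
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        info.append(2.0 - entropy)
    return info

# Hypothetical aligned donor-like sequences, not data from the paper.
sites = ["GTAAGT", "GTGAGT", "GTAAGA", "GTCAGT"]
bits = position_information(sites)
total_bits = sum(bits)
for pos, value in enumerate(bits):
    print(f"position {pos}: {value:.2f} bits")
print(f"total: {total_bits:.2f} bits")
```

Summing the per-position values gives the total information of a site. Since one bit corresponds to a twofold reduction in the number of candidates, a shortfall of about one bit at donor sites matches the factor of two mentioned in the abstract.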

17.
Defects in the XPG DNA repair endonuclease gene can result in the cancer-prone disorders xeroderma pigmentosum (XP) or the XP-Cockayne syndrome complex. While the XPG cDNA sequence was known, determination of the genomic sequence was required to understand its different functions. In cells from normal donors, we found that the genomic sequence of the human XPG gene spans 30 kb, contains 15 exons that range from 61 to 1074 bp and 14 introns that range from 250 to 5763 bp. Analysis of the splice donor and acceptor sites using an information theory-based approach revealed three splice sites with low information content, which are components of the minor (U12) spliceosome. We identified six alternatively spliced XPG mRNA isoforms in cells from normal donors and from XPG patients: partial deletion of exon 8, partial retention of intron 8, two with alternative exons (in introns 1 and 6) and two that retained complete introns (introns 3 and 9). The amount of alternatively spliced XPG mRNA isoforms varied in different tissues. Most alternative splice donor and acceptor sites had a relatively high information content, but one has the U12 spliceosome sequence. A single nucleotide polymorphism has allele frequencies of 0.74 for 3507G and 0.26 for 3507C in 91 donors. The human XPG gene contains multiple splice sites with low information content in association with multiple alternatively spliced isoforms of XPG mRNA.

18.
Newell CJ, Aziz CE. Biodegradation, 2004, 15(6): 387-394
The sustainability of biodegradation reactions is of interest at Type 1 chlorinated solvent sites where monitored natural attenuation is being considered as a remedial alternative. Type 1 chlorinated solvent sites are sites undergoing reductive dechlorination where anthropogenic substrates (such as landfill leachate or fermentable organics in the waste materials) ferment to produce hydrogen, a key electron donor. A framework is provided that classifies Type 1 chlorinated solvent sites based on the relative amounts and the depletion rates of the electron donors and the electron acceptors (i.e., chlorinated solvents). Expressions are presented for estimating the total electron donor demand due to the presence of solvents and competing electron acceptors such as dissolved oxygen, nitrate, and sulfate. Finally, a database of 13 chlorinated solvent sites was analyzed to estimate the median and maximum mass discharge rate for dissolved oxygen, nitrate, and sulfate flowing into chlorinated solvent plumes. These values were then used to calculate the amount of hydrogen equivalents and potential for lost perchloroethylene (PCE) biodegradation represented by the inflow of these competing electron acceptors. The median and maximum mass of PCE biodegradation lost due to competing electron acceptors, assuming 100% efficiency, was 226 and 4621 kg/year, respectively.
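The conversion from competing electron acceptors to hydrogen equivalents and lost PCE biodegradation can be illustrated with textbook stoichiometry. The abstract does not give the paper's expressions, so the sketch below is an assumption: it takes complete reduction of oxygen to water (2 mol H2 per mol), nitrate to N2 (2.5 mol H2 per mol), sulfate to sulfide (4 mol H2 per mol), and PCE to ethene (4 mol H2 per mol), at 100% efficiency. The input mass discharge is hypothetical.

```python
# Assumed stoichiometry: mol H2 consumed per mol of competing acceptor.
H2_PER_MOL = {"oxygen": 2.0, "nitrate": 2.5, "sulfate": 4.0}
# Molar masses in g/mol for O2, NO3-, and SO4(2-).
MOLAR_MASS_G = {"oxygen": 32.00, "nitrate": 62.00, "sulfate": 96.06}

PCE_MOLAR_MASS_G = 165.83   # C2Cl4
H2_PER_MOL_PCE = 4.0        # assumed: PCE fully dechlorinated to ethene

def hydrogen_demand_mol(mass_discharge_kg_per_year):
    """Hydrogen equivalents (mol/year) consumed by competing acceptors."""
    return sum(
        kg * 1000.0 / MOLAR_MASS_G[name] * H2_PER_MOL[name]
        for name, kg in mass_discharge_kg_per_year.items()
    )

def pce_equivalent_kg(h2_mol):
    """Mass of PCE (kg) that the same hydrogen could have dechlorinated."""
    return h2_mol / H2_PER_MOL_PCE * PCE_MOLAR_MASS_G / 1000.0

# Hypothetical inflow into a plume, not values from the paper's database.
inflow = {"oxygen": 10.0, "nitrate": 20.0, "sulfate": 50.0}
h2 = hydrogen_demand_mol(inflow)
print(f"hydrogen demand: {h2:.0f} mol/year")
print(f"PCE biodegradation lost: {pce_equivalent_kg(h2):.1f} kg/year")
```

The result is an upper bound in the same spirit as the abstract's 100% efficiency assumption. Partial dechlorination or other hydrogen sinks would change the figures.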

19.