Similar literature
 20 similar articles found; search time 484 ms
1.
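Prediction of alternative and constitutive splice sites in the human genome   Total citations: 2, self-citations: 0, citations by others: 2
Based on the conserved features of the nucleotide sequences at splice sites, together with the base composition and correlation properties of neighboring positions, and combined with the distance between a pair of alternative splice sites and the GC and TC content of the 30 bases preceding the acceptor splice site, a quadratic discriminant method combined with a diversity measure (IDQD) was used to predict the donor and acceptor splice sites of alternative and constitutive introns in the human genome. For alternative donor and acceptor splice sites, with the threshold ξ set to -2, the overall prediction accuracies were 87.9% and 89.9%, respectively. For constitutive donor and acceptor splice sites, with ξ set to -1, the overall prediction accuracies were 92.8% and 94.3%, respectively.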

2.
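Multiple trans-chromosomal splicing of the DMRT1 gene on the chicken Z chromosome   Total citations: 1, self-citations: 0, citations by others: 1
Sex determination and differentiation is the only developmental process that involves both forms of cell division (mitosis and meiosis). Transcriptional analysis of DMRT1, a key gene in this process, showed that the DMRT1 gene on the chicken Z chromosome undergoes trans-chromosomal splicing with the CENPC1 gene on chromosome 4, the CD5R gene on chromosome 5, and the 37LRP/p40 gene on chromosome 2, producing the novel trans-chromosomal splice variants DMRT1-CENPC1, DMRT1-CD5R, and DMRT1-37LRP/p40. Analysis of the splice sites suggests that overlapping regions between the two chromosomal sequences may play an important role in this splicing. The finding that DMRT1 undergoes multiple trans-chromosomal splicing events with genes on several chromosomes during transcription opens a new avenue for understanding diverse gene regulation at the transcriptional level, as well as sex determination and differentiation.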

3.
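Prediction of constitutive and alternative splice sites using estimated reaction free energy   Total citations: 2, self-citations: 0, citations by others: 2
An important step in gene structure prediction is the accurate identification of splice sites. Starting from the basic physical principles of the splicing reaction, the maximum information principle was applied to a theoretical analysis of the reaction, yielding an expression for estimating the reaction free energy. As a simplified model, this expression can be used to estimate the free energy change in a splicing reaction involving a 5′ or 3′ splice region. It captures the correlations among bases fairly comprehensively and also accounts for the genomic background probabilities. The expression was used to predict constitutive and alternative splice sites in human genes; the results were satisfactory, with predictive power comparable to that of some currently popular methods. This suggests that the maximum information principle can serve as a theoretical starting point for studying certain nucleic acid-protein interaction systems (such as the splicing reaction), and that the derived free energy expression fits the splicing reaction process well.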

4.
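Prediction of alternative splice sites for cassette exons and intron retention in the human genome   Total citations: 2, self-citations: 0, citations by others: 2
Alternative splicing of messenger RNA is one of the basic features that distinguish eukaryotes from prokaryotes. Alternative splicing of pre-mRNA greatly enriches protein diversity in higher eukaryotes and is closely related to tissue specificity. This article compiles statistics on some basic features of human cassette exons and intron retention. Based on features such as the conservation of single bases, dinucleotides, and trinucleotides near splice sites, a quadratic discriminant method based on a diversity measure was used to predict the donor and acceptor alternative splice sites of cassette exons and intron retention. Cross-validation showed recognition accuracies above 93% and 84% for the donor and acceptor sites of cassette exons, and above 89% and 81% for the donor and acceptor sites of intron retention, respectively.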

5.
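To improve the accuracy of splice site recognition in untranslated regions, a method combining statistical probability with a support vector machine is proposed. The method has two stages. In the first stage, statistical methods are used to describe untranslated region (UTR) sequences: features such as correlations between bases, position specificity, and conservation are expressed as probabilities, and these probability parameters serve as the input vector for the second stage. In the second stage, a support vector machine (SVM) with a polynomial kernel is used to recognize splice sites. Tests on a human 5′UTR splice site data set showed that the method performs well in recognizing splice sites in untranslated regions.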

6.
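To reveal the influence of transposons on the genomic diversity and evolution of Old World monkeys, the characteristics of transposons in the genomes of four Old World monkeys (the olive baboon Papio anubis, the rhesus macaque Macaca mulatta, the green monkey Chlorocebus sabaeus, and the proboscis monkey Nasalis larvatus) were compared using the Repbase database and RepeatMasker, with a focus on the composition of the Alu family, Alu loci with insertion/deletion polymorphism, and species-specific Alu insertions. In all four genomes, short interspersed repeats had the highest copy number and the lowest average divergence rate; among them, the primate-specific Alu retrotransposons exceeded one million copies (except in the proboscis monkey). The basic composition and distribution of transposons in the four genomes agree with their evolutionary relationships: the olive baboon and the rhesus macaque have similar transposon characteristics, and both differ considerably from the green monkey and the proboscis monkey. Pairwise genome comparisons identified a large number of Alu loci with insertion/deletion polymorphism among the four genomes, as well as 7,882 species-specific Alu loci, more than 95% of which belong to the AluY family. The results show that Alu transposition activity has had an important influence on the evolution and diversity of Old World monkey genomes.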

7.
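Short CpG islands 200-500 bp in length play a very important role in the regulation of gene expression, yet there is currently no good method for finding them. Based on a permutation-and-combination algorithm for the random distribution of bases on a DNA fragment of given length, we define a new method for calculating the CpG observed/expected ratio. Combining two parameters, DNA fragment length and GC content, the method gives predictions of the distribution of CpG islands on human chromosomes 21 and 22. Based on a comparative analysis of CpG islands against gene functional regions, Alu repeats, and UCSC CpG islands, this study gives new criteria for identifying CpG islands: (1) the CpG island is no shorter than 200 bp; (2) the GC fraction is no less than 50%; (3) the CpG observed/expected ratio is no less than 1.4. Comparison with the Takai method shows that the new method significantly excludes the influence of Alu repeats on CpG island prediction and can accurately predict the distribution of shorter CpG islands on DNA fragments. Analysis of the transcription start sites of multiple genes shows that short CpG islands are the core component of UCSC CpG islands and are core elements in the regulation of gene expression. This study provides a necessary means for predicting and analyzing the role of short CpG islands in human gene regulation.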
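
The three criteria in the abstract above (length, GC fraction, CpG observed/expected ratio) can be sketched as a simple filter. This is only an illustration: the abstract does not give its permutation-based definition of the observed/expected ratio, so the sketch assumes the conventional formula (number of CpG × length) / (number of C × number of G). The function names `cpg_stats` and `meets_criteria` are hypothetical.

```python
def cpg_stats(seq: str) -> dict:
    """Return length, GC fraction, and a conventional CpG observed/expected ratio."""
    s = seq.upper()
    n = len(s)
    c, g = s.count("C"), s.count("G")
    cpg = s.count("CG")  # "CG" cannot overlap itself, so str.count is exact
    gc_fraction = (c + g) / n if n else 0.0
    # ASSUMPTION: conventional O/E = (#CpG * N) / (#C * #G). The abstract's own
    # permutation-based definition is not given and is not reproduced here.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return {"length": n, "gc_fraction": gc_fraction, "obs_exp": obs_exp}


def meets_criteria(seq: str, min_len: int = 200,
                   min_gc: float = 0.5, min_oe: float = 1.4) -> bool:
    """Apply the three thresholds quoted in the abstract to one DNA fragment."""
    st = cpg_stats(seq)
    return (st["length"] >= min_len
            and st["gc_fraction"] >= min_gc
            and st["obs_exp"] >= min_oe)
```

A real scan would slide a window along a chromosome and merge passing windows; that step, and any masking of Alu repeats, is left out here.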

8.
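This paper studies a human splice site database and finds clear characteristic information within the 30 intronic bases to the left of the splice acceptor site. A new recognition method was established using the Lyapunov theorem, with a recognition accuracy of 80%. The method is simple and practical, and provides an important reference for gene recognition and for studies of intron function.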

9.
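Sex determination and differentiation is the only developmental process that involves both forms of cell division (mitosis and meiosis). Transcriptional analysis of DMRT1, a key gene in this process, showed that the DMRT1 gene on the chicken Z chromosome undergoes trans-chromosomal splicing with the CENP C1 gene on chromosome 4, the CD5R gene on chromosome 5, and the 37LRP/p40 gene on chromosome 2, producing the novel trans-chromosomal splice variants DMRT1-CENP C1, DMRT1-CD5R, and DMRT1-37LRP/p40. Analysis of the splice sites suggests that overlapping regions between the two chromosomal sequences may play an important role in this splicing. The finding that DMRT1 undergoes multiple trans-chromosomal splicing events with genes on several chromosomes during transcription opens a new avenue for understanding diverse gene regulation at the transcriptional level, as well as sex determination and differentiation.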

10.
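Objective: To investigate the relationship between Alu sequence methylation and the metastatic potential of breast cancer. Methods: Combined bisulfite restriction analysis (COBRA) and bisulfite sequencing (BSP) were used to examine Alu methylation status in two breast cancer cell lines with different metastatic ability, MCF7 and MDA-MB-435S; 10 clones were picked from each sample for sequencing. Results: Alu methylation levels in both MCF7 and MDA-MB-435S were markedly lower than the reported levels in normal human cells, but the level in MCF7 was markedly higher than that in MDA-MB-435S. In addition, Alu methylation sites were unevenly distributed in the genome. Conclusion: The metastatic potential of breast cancer may be related to the demethylation of Alu sequences and the distribution of demethylated sites, which merits further study.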

11.
A clean data set of verified splice sites from Homo sapiens is reported, together with the standards used for the clean-up procedure. The sites were validated by: (i) standard cleaning procedures such as requiring consistency in the annotation of the gene structural elements, completeness of the coding regions and elimination of redundant sequences; (ii) clustering by decision trees coupled with analysis of ClustalW alignments of the translated protein sequence with homologous proteins from SWISS-PROT; (iii) matching against human EST sequences. The sites are categorised as: (i) donor sites, a set of 619 EST-confirmed donor sites, in 138 of which either the sites or the regions around them are involved in alternative splice events; (ii) acceptor sites, a set of 623 EST-confirmed acceptor sites, in 144 of which either the sites or the regions around them are involved in alternative splice events; (iii) genuine splice sites, a set of 392 splice sites wherein both the donor and acceptor sites had EST confirmation and were not involved in any alternative splicing; (iv) alternative splice sites, a set of 209 splice sites wherein both the donor and acceptor sites had EST confirmation and the sites or the regions around them were involved in alternative splicing. A set of nucleotide regions that can be used to generate a control set of false splice sites with a high confidence of being non-functional is also reported.

12.
Intron definition in splicing of small Drosophila introns.   Total citations: 4, self-citations: 1, citations by others: 3
Approximately half of the introns in Drosophila melanogaster are too small to function in a vertebrate and often lack the pyrimidine tract associated with vertebrate 3' splice sites. Here, we report the splicing and spliceosome assembly properties of two such introns: one with a pyrimidine-poor 3' splice site and one with a pyrimidine-rich 3' splice site. The pyrimidine-poor intron was absolutely dependent on its small size for in vivo and in vitro splicing and assembly. As such, it had properties reminiscent of those of yeast introns. The pyrimidine-rich intron had properties intermediate between those of yeasts and vertebrates. This 3' splice site directed assembly of ATP-dependent complexes when present as either an intron or exon and supported low levels of in vivo splicing of a moderate-length intron. We propose that splice sites can be recognized as pairs across either exons or introns, depending on which distance is shorter, and that a pyrimidine-rich region upstream of the 3' splice site facilitates the exon mode.

13.
Alternative splicing (AS) constitutes a major mechanism creating protein diversity in humans. Previous bioinformatics studies based on expressed sequence tag and mRNA data have identified many AS events that are conserved between humans and mice. Of these events, ~25% are related to alternative choices of 3′ and 5′ splice sites. Surprisingly, half of all these events involve 3′ splice sites that are exactly 3 nt apart. These tandem 3′ splice sites result from the presence of the NAGNAG motif at the acceptor splice site, recently reported to be widely spread in the human genome. Although the NAGNAG motif is common in human genes, only a small subset of sites with this motif is confirmed to be involved in AS. We examined the NAGNAG motifs and observed specific features such as high sequence conservation of the motif, high conservation of ~30 bp at the intronic regions flanking the 3′ splice site and overabundance of cis-regulatory elements, which are characteristic of alternatively spliced tandem acceptor sites and can distinguish them from the constitutive sites in which the proximal NAG splice site is selected. Our findings imply that AS at tandem splice sites and constitutive splicing of the distal NAG are highly regulated.
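
As a rough illustration of the motif discussed in the abstract above, the sketch below scans a DNA string for every NAGNAG occurrence, where N is any of A, C, G, or T. The function name `find_nagnag` is hypothetical, and the scan is purely textual: as the abstract stresses, the presence of the motif alone does not show that a site is alternatively spliced.

```python
import re

# A lookahead keeps matches zero-width, so overlapping motifs
# (for example in "CAGCAGCAG") are all reported.
NAGNAG = re.compile(r"(?=([ACGT]AG[ACGT]AG))")


def find_nagnag(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, motif) for every NAGNAG occurrence in seq."""
    s = seq.upper()
    return [(m.start(), m.group(1)) for m in NAGNAG.finditer(s)]
```

For example, `find_nagnag("TTTCAGCAGGT")` reports a single motif, `CAGCAG`, starting at index 3; the two AG dinucleotides it contains are the candidate tandem acceptors 3 nt apart.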

14.
The performance of computational tools that can predict human splice sites is reviewed using a test set of EST-confirmed splice sites. The programs (namely HMMgene, NetGene2, HSPL, NNSPLICE, SpliceView and GeneID-3) differ from one another in the degree of discriminatory information used for prediction. The results indicate that, as expected, HMMgene and NetGene2 (which use global as well as local coding information and splice signals) followed by HSPL (which uses local coding information and splice signals) performed better than the other three programs (which use only splice signals). For the former three programs, one in every three false positive splice sites was predicted in the vicinity of true splice sites while only one in every 12 was expected to occur in such a region by chance. The persistence of this observation for programs (namely FEXH, GRAIL2, MZEF, GeneID-3, HMMgene and GENSCAN) that can predict all the potential exons (including optimal and sub-optimal) was assessed. In a high proportion (>50%) of the partially correct predicted exons, the incorrect exon ends were located in the vicinity of the real splice sites. Analysis of the distribution of proximal false positives indicated that the splice signals used by the algorithms are not strong enough to discriminate, in particular, those false predictions that occur within ±25 nt of the real sites. It is therefore suggested that specialised statistics that can discriminate real splice sites from proximal false positives be incorporated in gene prediction programs.

15.
16.
How did alternative splicing evolve?   Total citations: 15, self-citations: 0, citations by others: 15

17.
The molecular basis of the skipping of constitutive exons in many messenger RNAs is not fully understood. A well-studied example is exon 9 of the human cystic fibrosis transmembrane conductance regulator gene (CFTR), in which an abbreviated polypyrimidine tract between the branch point A and the 3' splice site is associated with increased exon skipping and disease. However, many exons, both in CFTR and in other genes, have short polypyrimidine tracts in their 3' splice sites, yet they are not skipped. Inspection of the 5' splice sites immediately up- and downstream of exon 9 revealed deviations from the consensus sequence, so we hypothesized that this exon may be inherently vulnerable to skipping. To test this idea, we constructed a CFTR minigene and replicated exon 9 skipping associated with the length of the polypyrimidine tract upstream of exon 9. We then mutated the flanking 5' splice sites and determined the effect on exon skipping. Conversion of the upstream 5' splice site to consensus by replacing a pyrimidine at position +3 with a purine resulted in increased exon skipping. In contrast, conversion of the downstream 5' splice site to consensus by insertion of an adenine at position +4 resulted in a substantial reduction in exon 9 skipping, regardless of whether the upstream 5' splice site was consensus or not. These results suggested that the native downstream 5' splice site plays an important role in CFTR exon 9 skipping, a hypothesis that was supported by data from sheep and mouse genomes. Although CFTR exon 9 in sheep is preceded by a long polypyrimidine tract (Y(14)), it skips exon 9 in vivo and has a nonconsensus downstream 5' splice site identical to that in humans. On the other hand, CFTR exon 9 in mice is preceded by a short polypyrimidine tract (Y(5)) but is not skipped in vivo. Its downstream 5' splice site differs from that in humans by a 2-nt insertion, which, when introduced into the human CFTR minigene, abolished exon 9 skipping.
Taken together, these observations place renewed emphasis on deviations at 5' splice sites in nucleotides other than the invariant GT, particularly when such changes are found in conjunction with other altered splicing sequences, such as a shortened polypyrimidine tract. Thus, careful inspection of entire 5' splice sites may identify constitutive exons that are vulnerable to skipping.

18.
We have collected over half a million splice sites from five species (Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans and Arabidopsis thaliana) and classified them into four subtypes: U2-type GT-AG and GC-AG and U12-type GT-AG and AT-AC. We have also found new examples of rare splice-site categories, such as U12-type introns without canonical borders, and U2-dependent AT-AC introns. The splice-site sequences and several tools to explore them are available on a public website (SpliceRack). For the U12-type introns, we find several features conserved across species, as well as a clustering of these introns on genes. Using the information content of the splice-site motifs, and the phylogenetic distance between them, we identify: (i) a higher degree of conservation in the exonic portion of the U2-type splice sites in more complex organisms; (ii) conservation of exonic nucleotides for U12-type splice sites; (iii) divergent evolution of C. elegans 3' splice sites (3'ss) and (iv) distinct evolutionary histories of 5' and 3'ss. Our study proves that the identification of broad patterns in naturally-occurring splice sites, through the analysis of genomic datasets, provides mechanistic and evolutionary insights into pre-mRNA splicing.
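
The border classes named in the abstract above (GT-AG, GC-AG, AT-AC) can be read directly from the first and last two nucleotides of an intron. The sketch below does only that; the function name `classify_borders` is hypothetical. It deliberately does not assign U2 or U12 type, because the abstract itself lists GT-AG and AT-AC introns of both kinds, so the borders alone cannot decide it.

```python
CANONICAL_BORDERS = {"GT-AG", "GC-AG", "AT-AC"}


def classify_borders(intron: str) -> str:
    """Label an intron by its terminal dinucleotides, e.g. 'GT-AG' or 'other'."""
    s = intron.upper()
    if len(s) < 4:
        raise ValueError("intron must be at least 4 nt long")
    label = f"{s[:2]}-{s[-2:]}"  # 5' dinucleotide, then 3' dinucleotide
    return label if label in CANONICAL_BORDERS else "other"
```

Separating U2-type from U12-type introns would need additional signals, such as the 5' splice site and branch point motifs, which this sketch does not model.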

19.
Prediction of exact boundaries of exons   Total citations: 3, self-citations: 0, citations by others: 3
It is known that while the programs used to predict genes are good at determining coding nucleotides, there are considerable inaccuracies in the determination of the gene structural elements. Among them, the most notable is that of the exact boundaries of exons. In order to assess this, we had earlier reviewed various programs that predict potential splice sites and exons. The results led to the following two observations: (i) a high proportion of false positive splice sites from computational predictions occur in the vicinity of real splice sites; and (ii) current algorithms are misled to predict wrong splice sites more often when the coding potential ends within ±25 nucleotides of real sites than when it ends at farther positions. In this report, we review decision tree models for human splice sites and the resultant software tool, namely SpliceProximalCheck, that discriminates such 'proximal' false positives from real splice sites. Further presented is an integrated system (MZEF-SPC) with SpliceProximalCheck (SPC) as a front-end tool operating on the results of Michael Zhang's exon finder program. Examination of the output of the integrated program on an illustrative gene set revealed that as many as 61 of 93 MZEF-predicted false positive exons could be eliminated by SPC for a loss of only 3 out of 33 MZEF-predicted true positive exons.

20.
Circularly permuted group I intron precursor RNAs, containing end-to-end fused exons which interrupt half-intron sequences, were generated and tested for self-splicing activity. An autocatalytic RNA can form when the primary order of essential intron sequence elements, splice sites, and exons is permuted in this manner. Covalent attachment of guanosine to the 5' half-intron product, and accurate exon ligation indicated that the mechanism and specificity of splicing were not altered. However, because the exons were fused and the order of the splice sites reversed, splicing released the fused exon as a circle. With this arrangement of splice sites, circular exon production was a prediction of the group I splicing mechanism. Circular RNAs have properties that would make them attractive for certain studies of RNA structure and function. Reversal of splice site sequences in a context that allows splicing, such as those generated by circularly permuted group I introns, could be used to generate short defined sequences of circular RNA in vitro and perhaps in vivo.
