Similar articles
18 similar articles found
1.
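Gene prediction means identifying the protein-coding parts of a DNA sequence. As genome sequencing has been completed for many organisms, gene prediction has become especially important. There are two main approaches: homology-based methods, also called "extrinsic" methods, and gene-prediction methods proper, also called "intrinsic" methods. This paper introduces several intrinsic methods, including hidden Markov models, the Fourier transform and dynamic programming.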

2.
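The generalized hidden Markov model (GHMM) is an important model for gene identification, but its computational cost is far higher than that of a conventional hidden Markov model, so it cannot be applied directly to gene identification. Exploiting the structural features of prokaryotic genes, an efficient simplified algorithm is proposed whose cost is a linear function of sequence length. On this basis the prokaryotic gene-identification program GeneMiner was built; tests on real data show that the algorithm is effective.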

3.
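A context-sensitive hidden Markov model is used to model the basic secondary structure of ncRNA genes. Sequence-similarity information is extracted from pairwise sequence alignments and introduced into the basic context-sensitive Markov model, yielding a new ncRNA gene-identification model.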

4.
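Ion channels are macromolecular pores in the cell membrane: membrane-spanning protein macromolecules that underlie the excitability of nerve, muscle and other cell membranes. All human cells have ion channels that perform specialized functions, so building kinetic models of ion channels, especially of their gating behaviour, is important for research on related topics. The opening and closing of an ion channel reflects the kinetics of protein conformational change. This paper introduces the basics of membrane ion channels and several commonly used models and, using Markov chains, extends the Markov process for a two-state (closed, open) model of ion-channel gating to obtain a Markov model of gating that includes an inactivated state.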

5.
Li Fangfang, Peng Shizheng, Wang Xiaohua. 《生物磁学》, 2009, (21): 4130-4132
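Ion channels are macromolecular pores in the cell membrane: membrane-spanning protein macromolecules that underlie the excitability of nerve, muscle and other cell membranes. All human cells have ion channels that perform specialized functions, so building kinetic models of ion channels, especially of their gating behaviour, is important for research on related topics. The opening and closing of an ion channel reflects the kinetics of protein conformational change. This paper introduces the basics of membrane ion channels and several commonly used models and, using Markov chains, extends the Markov process for a two-state (closed, open) model of ion-channel gating to obtain a Markov model of gating that includes an inactivated state.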
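To make the idea of adding an inactivated state concrete, here is a minimal sketch of a discrete-time three-state gating chain. The state names, the transition probabilities and the helper functions `step` and `stationary` are illustrative assumptions, not values or code from the paper; the sketch only shows how a third state changes the long-run occupancy of the channel.

```python
# Illustrative three-state gating chain: closed (C), open (O), inactivated (I).
# All transition probabilities are made-up values for demonstration only.
STATES = ["C", "O", "I"]
P = [
    [0.90, 0.10, 0.00],  # from C to C, O, I
    [0.05, 0.85, 0.10],  # from O to C, O, I
    [0.02, 0.00, 0.98],  # from I to C, O, I
]

def step(dist, P):
    """Advance a state distribution by one time step: dist' = dist * P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def stationary(P, iters=10000):
    """Approximate the stationary distribution by repeated stepping."""
    dist = [1.0, 0.0, 0.0]  # start with the channel closed
    for _ in range(iters):
        dist = step(dist, P)
    return dist

pi = stationary(P)
for name, prob in zip(STATES, pi):
    print(f"long-run probability of {name}: {prob:.3f}")
```

Setting the O to I and I to C probabilities to zero (and renormalizing the rows) recovers a plain two-state closed/open chain, which is one way to compare the two model structures.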

6.
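Prokaryotic promoter prediction based on predicted transcription units   Total citations: 8, self-citations: 0, citations by others: 8
Predicting promoters and transcription units is important for understanding gene function and the regulatory relationships among genes. It has long been a major direction in bioinformatics, but prediction accuracy has remained limited. This paper establishes a new method that predicts prokaryotic promoters on the basis of transcription-unit prediction. Transcription units are first predicted from intergenic distance, functional relationships and multi-gene alignment results; the results are fairly satisfactory, and the approach applies to both well-studied and little-studied genomes. Promoters are then predicted from the transcription-unit predictions with a hidden Markov chain model in which state dwell time is taken into account. The results show that the method effectively predicts promoter sequences and their positions, with accuracy above 70%.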

7.
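Hidden Markov model: an improved method for predicting protein secondary structure   Total citations: 1, self-citations: 0, citations by others: 1
A new method for protein secondary structure prediction, the hidden Markov model, is introduced. Secondary structure is divided into three classes: H (α-helix), E (β-sheet) and O (turns, coils and other structures). The method is statistical but takes interactions between neighbouring amino acids into account (reflected in the state transition probabilities). After improving the model and determining its parameters, we wrote the program HMMPS. Used to predict protein secondary structure, it achieves accuracies of 80.1%, 72.0% and 63.2% for H, E and O respectively, which indicates that the method is fairly reliable.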

8.
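CLIMEX: software for predicting species distribution   Total citations: 20, self-citations: 2, citations by others: 20
Song Hongmin, Zhang Qingfen, Han Xuemei, Xu Yan, Xu Rumei. 《昆虫知识》, 2004, 41(4): 379-386, F003
CLIMEX is software that predicts a species' potential distribution from the climatic parameters of its known geographic range. The latest version, CLIMEX for Windows 1.1, was released in 1999. CLIMEX rests on two basic assumptions: (1) within a year a species experiences two periods, one suitable for population growth and one unsuitable or even life-threatening; (2) climate is the main factor governing species distribution. It uses growth indices, stress indices and limiting conditions (diapause and effective accumulated temperature) to describe a species' responses to climate; these two groups of parameters make up the ecoclimatic index, which serves as an overall measure of a species' suitability for a given place and year. Predictions are output as tables, graphs and maps. CLIMEX can be used in quarantine, biological control, pest risk analysis, pest management and epidemic forecasting, and has so far been applied to habitat-suitability studies of dozens of pest species. Taking the fitting of the distribution of the pine sawyer beetle Monochamus alternatus in China as an example, this paper illustrates the use of CLIMEX and, from the climatic conditions of the beetle's range in eastern Asia, predicts its potential suitable range worldwide, providing a scientific basis for animal and plant quarantine authorities to take timely measures against the further spread of the pine wood nematode.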

9.
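Detecting deletions and insertions in coding sequences with a hidden Markov model (in English)   Total citations: 2, self-citations: 0, citations by others: 2
Since the completion of genome sequencing projects, computational gene identification and gene-structure prediction have drawn growing attention. Many application programs have been developed, such as GenScan, Genemark and GRAIL, and they provide important clues for finding new genes. The gene identification and prediction problem is nevertheless not fully solved: when the coding sequence of a target gene contains deletions or insertions, the predicted structure differs greatly from the actual one. To remove the influence of sequencing errors on prediction, one would like to locate sequencing errors within coding regions. To this end, the authors use statistical properties of DNA sequences and a hidden Markov model (HMM) with added deletion and insertion states, then apply the Viterbi algorithm to find exon fragments containing deletions and insertions. On the widely used Burset/Guigo test set, sensitivity (Sn) and specificity (Sp) at the exon level both exceed 84%.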
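The Viterbi step mentioned in the abstract can be sketched as follows. This is a generic log-space Viterbi decoder on a toy two-state model; the state names `exon` and `other`, all probabilities and the test sequence are assumptions chosen for illustration, and the sketch does not reproduce the paper's deletion and insertion states.

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable state path for obs, computed in log space."""
    # V[t][s] = best log-probability of any path that ends in state s at time t
    V = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            # Pick the predecessor state that maximizes the path score
            prev, score = max(
                ((r, V[t - 1][r] + math.log(trans_p[r][s])) for r in states),
                key=lambda x: x[1],
            )
            V[t][s] = score + math.log(emit_p[s][obs[t]])
            back[t][s] = prev
    # Trace back from the best final state
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]

# Toy model (hypothetical numbers): "exon" is G/C-rich, "other" is A/T-rich.
states = ["exon", "other"]
start_p = {"exon": 0.5, "other": 0.5}
trans_p = {"exon": {"exon": 0.9, "other": 0.1},
           "other": {"exon": 0.1, "other": 0.9}}
emit_p = {"exon": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
          "other": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4}}

path = viterbi("AAAAGGGGGGAAAA", states, start_p, trans_p, emit_p)
print(path)
```

Working in log space avoids numerical underflow on long sequences; every probability in the toy model is non-zero, so no logarithm of zero occurs. A model with indel states would add states and transitions but use the same recursion.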

10.
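Succession analysis and prediction for riparian forests in the upper reaches of the Wenyu River   Total citations: 6, self-citations: 1, citations by others: 6
Gao Runmei, Guo Jinping. 《生态学报》, 2010, 30(6): 1564-1572
Taking the riparian forests of the upper Wenyu River as the study object, and substituting space for time, the trend and course of community succession were studied by static succession analysis combined with a Markov model. Based on multiple comparisons of community climax adaptation values and on the biological and ecological traits of the dominant tree species, the 13 communities were assigned to 4 successional stages. Gradient analysis of succession agreed with this grouping, and communities in the same or adjacent stages were highly similar. Combining these results, the successional series of riparian forest communities in the area was constructed as: Ⅰ broad-leaved forest stage (communities PCS and CPM) → Ⅱ broadleaf-conifer mixed forest stage (communities CPP, CMM and CPW) → Ⅲ conifer-broadleaf mixed forest stage (communities PRL, PCP, MCP, PRM and PRW) → Ⅳ coniferous forest stage (community PWS). The Cathay poplar and Liaodong oak mixed forest (PCL) and the Chinese pine and white birch mixed forest (TPM) were only weakly related to the other communities and belong to the low-mountain forest successional series. The Markov model predicted that the poplar-birch-larch mixed forest (CPP) and the poplar-birch-spruce mixed forests (CMM, CPW) would succeed toward spruce forest, which further supports the constructed successional series and refines the course of succession.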

11.
The gene-finding programs developed so far have not paid much attention to the detection of short protein coding regions (CDSs). However, the detection of short CDSs is important for the study of photosynthesis. We utilized GeneHacker, a gene-finding program based on the hidden Markov model (HMM), to detect short CDSs (from 90 to 300 bases) in a 1.0 mega contiguous sequence of cyanobacterium Synechocystis sp. strain PCC6803 which carries a complete set of genes for oxygenic photosynthesis. GeneHacker differs from other gene-finding programs based on the HMM in that it utilizes di-codon statistics as well. GeneHacker successfully detected seven out of the eight short CDSs annotated in this sequence and was clearly superior to GeneMark in this range of length. GeneHacker detected 94 potentially new CDSs, 9 of which have counterparts in the genetic databases. Four of the nine CDSs were less than 150 bases and were photosynthesis-related genes. The results show the effectiveness of GeneHacker in detecting very short CDSs corresponding to genes.

12.
MicroRNAs are one class of small single-stranded RNA of about 22 nt serving as important negative gene regulators. In animals, miRNAs mainly repress protein translation by binding to the 3′ UTR regions of mRNAs with imperfect complementary pairing. Although bioinformatics investigations have resulted in a number of target prediction tools, all of these have a common shortcoming: a high false positive rate. Therefore, it is important to further filter the predicted targets. In this paper, based on the miRNA:target duplex, we construct a second-order Hidden Markov Model, implement the Baum-Welch training algorithm and apply this model to further process predicted targets. The classifier is trained on 244 positive and 49 negative miRNA:target interaction pairs and achieves a sensitivity of 72.54%, specificity of 55.10% and accuracy of 69.62% in 10-fold cross-validation experiments. In order to further verify the applicability of the algorithm, previously collected datasets, including 195 positive and 38 negative pairs, are chosen to test it, with consistent results. We believe that our method will provide some guidance for experimental biologists, especially in choosing miRNA targets for validation.
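The three reported figures can be checked against the standard definitions of sensitivity, specificity and accuracy. In the sketch below the confusion-matrix counts (177 true positives, 27 true negatives) are not stated in the abstract; they are inferred here as the pooled counts that are consistent with the reported percentages for 244 positive and 49 negative pairs, and the function name `confusion_metrics` is hypothetical.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # fraction of positives recovered
    specificity = tn / (tn + fp)            # fraction of negatives rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Inferred counts (an assumption): 177 + 67 = 244 positives, 27 + 22 = 49 negatives
sn, sp, acc = confusion_metrics(tp=177, fn=67, tn=27, fp=22)
print(f"Sn = {sn:.2%}, Sp = {sp:.2%}, Acc = {acc:.2%}")
```

Because positives outnumber negatives roughly five to one, accuracy sits much closer to sensitivity than to specificity, which is worth keeping in mind when reading the 69.62% figure.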

13.
In hidden Markov models, the probability of observing a set of strings can be computed using recursion relations. We construct a sufficient condition for simplifying the recursion relations for a certain class of hidden Markov models. If the condition is satisfied, then one can construct a reduced recursion where the dependence on Markov states completely disappears. We discuss a specific example, namely statistical multiple alignment based on the TKF-model, in which the sufficient condition is satisfied.
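For orientation, the standard state-dependent recursion that such a simplification starts from is the forward algorithm. The sketch below implements it for a single observed string on a toy two-state model; the model parameters and the names `forward`, `s1`, `s2` are assumptions for illustration, and the reduced recursion of the paper is not reproduced here.

```python
def forward(obs, states, start_p, trans_p, emit_p):
    """Return P(obs) by summing over all hidden state paths."""
    # alpha[s] = probability of the prefix seen so far, ending in state s
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for symbol in obs[1:]:
        alpha = {
            s: emit_p[s][symbol] * sum(alpha[r] * trans_p[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())

# Toy two-state model over the alphabet {a, b}; all numbers are made up.
states = ["s1", "s2"]
start_p = {"s1": 0.6, "s2": 0.4}
trans_p = {"s1": {"s1": 0.7, "s2": 0.3},
           "s2": {"s1": 0.2, "s2": 0.8}}
emit_p = {"s1": {"a": 0.9, "b": 0.1},
          "s2": {"a": 0.3, "b": 0.7}}

print(forward("aab", states, start_p, trans_p, emit_p))
```

Each update touches every pair of states, so the cost is proportional to the sequence length times the square of the number of states. A recursion that no longer depends on the states removes that factor, which is why the sufficient condition matters in practice.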

14.
Although a number of bacterial gene-finding programs have been developed, there is still room for improvement especially in the area of correctly detecting translation start sites. We developed a novel bacterial gene-finding program named GeneHacker Plus. Like many others, it is based on a hidden Markov model (HMM) with duration. However, it is a 'local' model in the sense that the model starts from the translation control region and ends at the stop codon of a coding region. Multiple coding regions are identified as partial paths, like local alignments in the Smith-Waterman algorithm, regardless of how they overlap. Moreover, our semiautomatic procedure for constructing the model of the translation control region allows the inclusion of an additional conserved element as well as the ribosome-binding site. We confirmed that GeneHacker Plus is one of the most accurate programs in terms of both finding potential coding regions and precisely locating translation start sites. GeneHacker Plus is also equipped with an option where the results from database homology searches are directly embedded in the HMM. Although this option does not raise the overall predictability, labeled similarity information can be of practical use. GeneHacker Plus can be accessed freely at http://elmo.ims.u-tokyo.ac.jp/GH/.

15.
Throughout history, the population size of modern humans has varied considerably due to changes in environment, culture, and technology. More accurate estimates of population size changes, and when they occurred, should provide a clearer picture of human colonization history and help remove confounding effects from natural selection inference. Demography influences the pattern of genetic variation in a population, and thus genomic data of multiple individuals sampled from one or more present-day populations contain valuable information about the past demographic history. Recently, Li and Durbin developed a coalescent-based hidden Markov model, called the pairwise sequentially Markovian coalescent (PSMC), for a pair of chromosomes (or one diploid individual) to estimate past population sizes. This is an efficient, useful approach, but its accuracy in the very recent past is hampered by the fact that, because of the small sample size, only few coalescence events occur in that period. Multiple genomes from the same population contain more information about the recent past, but are also more computationally challenging to study jointly in a coalescent framework. Here, we present a new coalescent-based method that can efficiently infer population size changes from multiple genomes, providing access to a new store of information about the recent past. Our work generalizes the recently developed sequentially Markov conditional sampling distribution framework, which provides an accurate approximation of the probability of observing a newly sampled haplotype given a set of previously sampled haplotypes. Simulation results demonstrate that we can accurately reconstruct the true population histories, with a significant improvement over the PSMC in the recent past. We apply our method, called diCal, to the genomes of multiple human individuals of European and African ancestry to obtain a detailed population size change history during recent times.

16.
17.
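Existing methods for protein subcellular localization are designed for water-soluble proteins and do not suit transmembrane proteins, while dedicated transmembrane topology predictors are not designed for subcellular localization. This paper improves the model structure of the transmembrane topology predictor TMPHMMLoc and designs a new second-order hidden Markov model. Model parameters are estimated with the Baum-Welch algorithm extended to second-order models, and the models built for the individual subcellular locations are integrated into a single predictor. Tests on the data set show that the method performs significantly better than the support vector machine and fuzzy k-nearest-neighbour methods designed for soluble proteins, and also better than the hidden Markov model method proposed in TMPHMMLoc, making it an effective predictor of transmembrane protein subcellular localization.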

18.
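Conventional methods for identifying gene splice sites require long sequences and many parameters. To address this, the paper proposes a variable-length Markov model based on KL divergence (Kullback-Leibler divergence variable length Markov model, KL-VLMM). The model improves on the variable-length Markov model by using KL divergence, instead of the original probability ratio, to decide the direction of sequence extension. This effectively improves recognition of characteristic sequences, and the model order drops from second to first, lowering the space complexity of the algorithm. Using the human splice-site database N269, the model was compared with other conventional methods. The experiments show that KL-VLMM predicts human gene splice sites better.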
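As a rough illustration of how KL divergence can drive a context-extension decision, the sketch below compares the next-nucleotide distribution under a shorter context with the one under a longer context and extends only when the two differ enough. The distributions, the threshold and the names `kl_divergence` and `should_extend` are hypothetical; the abstract does not give the paper's exact criterion, so this is one plausible reading rather than the authors' algorithm.

```python
import math

def kl_divergence(p, q):
    """D_KL(p || q) in bits; q must be non-zero wherever p is non-zero."""
    return sum(p[k] * math.log2(p[k] / q[k]) for k in p if p[k] > 0)

def should_extend(p_long, p_short, threshold=0.1):
    """Extend the context only if the longer context changes the prediction enough."""
    return kl_divergence(p_long, p_short) > threshold

# Hypothetical next-nucleotide distributions near a splice site
p_short = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}  # given 1 preceding base
p_long = {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05}   # given 2 preceding bases

print(kl_divergence(p_long, p_short))
print(should_extend(p_long, p_short))
```

Unlike a single probability ratio, the divergence summarizes the change over all four nucleotides at once, which is one reason it can be a steadier criterion when counts are sparse.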
