Similar literature
 18 similar articles found; search took 109 ms
1.
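A new nucleic acid sequence alignment algorithm and its application to global sequence alignment   Total citations: 1, self-citations: 0, citations by others: 1
The dynamic programming algorithm now widely used for sequence alignment achieves the optimal alignment, but its high computational complexity of O(N^2) greatly reduces computational efficiency. A multistage dynamic programming decision algorithm was applied to pairwise sequence alignment and implemented in Visual BASIC. The new algorithm reduces the computational complexity to O(N) while still achieving reasonably good accuracy, and it is expected to play an important role in global sequence alignment.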
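For reference, the quadratic baseline mentioned in this abstract can be sketched as follows. This is the classic Needleman-Wunsch recurrence with a linear gap penalty, not the multistage O(N) algorithm the paper proposes; the function name and the scoring values (match, mismatch, gap) are illustrative assumptions.

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Optimal global alignment score (Needleman-Wunsch, linear gaps).

    Time is O(len(a) * len(b)); only two rows are kept, so memory is O(len(b)).
    The scoring values are assumed defaults, not taken from the paper.
    """
    # Row 0: aligning the empty prefix of `a` against prefixes of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Best of: align a[i-1] with b[j-1], gap in b, gap in a.
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[len(b)]
```

Every cell of the N x N table is visited once, which is where the O(N^2) cost comes from.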

2.
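Zhang Lin, Chai Hui, Wo Like, Yuan Xiaofeng, Huang Yanfen. Chinese Journal of Bioinformatics (《生物信息学》), 2011, 9(2): 146-150, 154
Biological sequence alignment is the foundation of bioinformatics and one of the most commonly used and most important research methods in functional genomics today. This paper analyzes the strengths and weaknesses of the various classes of sequence alignment algorithms and explores the advantages of graphics hardware. On this basis, the dynamic programming algorithm, the most accurate of the sequence alignment algorithms, is implemented and mapped onto graphics hardware to accelerate it. Performance evaluation on examples shows that the accelerated algorithm considerably increases alignment speed while preserving alignment accuracy.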

3.
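Zhang Lin. Chinese Journal of Bioinformatics (《生物信息学》), 2014, 12(3): 179-184
To explore a local alignment method for biological sequences that is accurate, efficient, low-cost, and general-purpose, the dynamic programming local alignment algorithm, the most accurate among local alignment approaches such as dot-matrix and heuristic algorithms, was implemented on a computer and mapped onto graphics hardware through a stream model to accelerate it. Alignment time and million cell updates per second (MCUPS) were then evaluated by searching a database with example queries. The results show that the accelerated algorithm significantly increases alignment speed while preserving alignment accuracy. Compared with the fastest current heuristic algorithm, the average speedup is 14.5-fold and the maximum speedup reaches 22.9-fold.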
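The MCUPS figure used in this abstract is conventionally the number of dynamic programming cells filled per second, in millions. A minimal sketch of that calculation, assuming the cell count is query length times total database residues; the function name and the example numbers are hypothetical, not measurements from the paper.

```python
def mcups(query_length: int, database_residues: int, seconds: float) -> float:
    """Million cell updates per second for one database search.

    Assumes one DP cell per (query residue, database residue) pair.
    """
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return query_length * database_residues / seconds / 1e6


# Hypothetical run: a 500-residue query against 2,000,000 database residues in 2 s.
example = mcups(500, 2_000_000, 2.0)
```

Because the metric normalizes by problem size, it lets runs with different queries and databases be compared on one scale.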

4.
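Multiple sequence alignment is an important basic research topic in bioinformatics and a very important step in all kinds of RNA sequence analysis methods. Unlike DNA and proteins, many functional RNA molecules are far less conserved in sequence than in structure, so multiple alignment of RNA must take into account not only sequence information but also structural information. This paper proposes a multiple sequence alignment algorithm for homologous RNAs that incorporates structural information. It first uses a thermodynamic method to compute the base-pairing probability matrix of each sequence, which provides the structural information, and from this constructs a structural information vector for each sequence. Combined with traditional sequence alignment methods, an optimization objective function is proposed, and dynamic programming together with progressive alignment yields the final multiple sequence alignment. Experiments confirm the effectiveness of the method.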

5.
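The paper first introduces the molecular biology underlying sequence alignment, namely nucleotides, the basic units of nucleic acid sequences, and amino acids, the basic units of protein sequences. Carefully designed figures and tables list the names, properties, and classification of the four nucleotides and twenty amino acids. Section 2 outlines the basics of sequence alignment, including the concepts of similarity and homology, global and local alignment, the dot-matrix method, dynamic programming and heuristic algorithms, scoring matrices and gap penalties, and commonly used software and analysis platforms. Section 3 introduces DNAfull, the scoring matrix commonly used for nucleic acid alignment, and BLOSUM62 and PAM250, the scoring matrices commonly used for protein alignment. Sections 4-8 use hemoglobin, peptide toxins, plant transcription factors, carcinoembryonic antigen, and sialidase as examples to illustrate practical applications of pairwise alignment. These examples show how to choose an analysis platform and alignment program, how to set the scoring matrix and gap penalties, and how to analyze alignment results and their biological significance. The paper closes with a brief summary.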

6.
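Biological sequence alignment occupies an important place in bioinformatics research. Multiple sequence alignment is an NP-complete (NPC) problem, and time and space constraints make an exact solution infeasible. This paper briefly introduces the basic idea of the multiple sequence alignment algorithm proposed by Feng and Doolittle and improves the algorithm to achieve better alignment accuracy. Experimental results show that the new algorithm works well against the local-optimum problem encountered by typical progressive multiple sequence alignment methods.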

7.
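In studies of DNA sequence similarity, the commonly used dynamic programming algorithm lacks a theoretical basis for its gap penalty function, which makes the choice subjective and leads to differing results. This paper proposes a DNA sequence similarity measure based on the DTW (Dynamic Time Warping) distance that addresses this problem. DNA sequences are converted to time series through a graphical representation, and the DTW distance is then computed to measure sequence similarity and characterize the properties of the DNA sequences, yielding a measure by which DNA sequences can be compared. The method was used to compare the gene sequences of seven neurotoxins of the East Asian scorpion (Buthus martensii Karsch neurotoxin), verifying the effectiveness and accuracy of the measure.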
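The two steps described in this abstract (sequence to time series, then DTW distance) might be sketched as below. The abstract does not say which graphical representation is used, so `dna_to_series` is a hypothetical stand-in (a cumulative purine/pyrimidine walk); `dtw_distance` is the textbook DTW recurrence with absolute difference as the local cost.

```python
def dna_to_series(seq: str) -> list[int]:
    """Hypothetical mapping: step +1 for purines (A, G), -1 for anything else."""
    level, series = 0, []
    for base in seq.upper():
        level += 1 if base in "AG" else -1
        series.append(level)
    return series


def dtw_distance(x: list[float], y: list[float]) -> float:
    """Classic dynamic time warping distance between two numeric series."""
    inf = float("inf")
    n, m = len(x), len(y)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]
```

Note that no gap penalty appears anywhere: stretching one series against the other is paid for only through the local cost, which is the property the abstract relies on.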

8.
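A new relative distance for protein similarity analysis   Total citations: 1, self-citations: 0, citations by others: 1
This paper describes a new relative distance for analyzing the similarity of different protein sequences and for constructing phylogenetic trees. The distance is based on Lempel-Ziv complexity and requires neither sequence alignment nor complex algorithms. To demonstrate that the distance is reasonable, similarity analysis was performed on eight species and their phylogenetic tree was constructed.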
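As an illustration of an alignment-free distance of this kind, the sketch below counts Lempel-Ziv (1976) phrases and combines the counts into a normalized distance. The abstract does not give the paper's formula, so `lz_distance` is an assumed normalization chosen for illustration only; both function names are hypothetical.

```python
def lz_complexity(s: str) -> int:
    """Number of phrases in the Lempel-Ziv (1976) exhaustive parsing of s."""
    i, count, n = 0, 0, len(s)
    while i < n:
        length = 1
        # Grow the phrase while it can still be copied from earlier text
        # (the source may overlap the phrase itself, as in LZ76).
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count


def lz_distance(s: str, q: str) -> float:
    """Assumed normalized distance: extra phrases needed to describe one
    sequence after the other. Both sequences must be non-empty."""
    cs, cq = lz_complexity(s), lz_complexity(q)
    return max(lz_complexity(s + q) - cs, lz_complexity(q + s) - cq) / max(cs, cq)
```

A pairwise matrix of such distances can then be passed to any standard tree-building method, with no alignment step in between.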

9.
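Sequence alignment is an important tool in bioinformatics research, widely used in sequence assembly, protein structure prediction, protein structure-function analysis, phylogenetic analysis, database searching, and primer design. This paper describes in detail some sequence alignment algorithms commonly used in bioinformatics, compares their computational complexity, strengths, and weaknesses, discusses the scope of application of each, and points out directions for future research on sequence alignment.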

10.
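RNA sequence-structure alignment based on a quantum evolutionary algorithm   Total citations: 1, self-citations: 0, citations by others: 1
Multiple sequence alignment is a classic problem in computational molecular biology and an important basic step in many biological studies. RNA, as a biological macromolecule, differs from proteins and DNA in that its secondary structure is more conserved during evolution than its primary sequence, so RNA sequence alignment must consider not only sequence information but, above all, secondary structure information. A program for RNA multiple sequence-structure alignment based on a quantum evolutionary algorithm is proposed: RNA sequences are quantum-encoded, a full crossover operator that incorporates structural information is designed, and a fitness function suited to RNA sequence-structure alignment is proposed, overcoming the slow convergence and premature convergence of traditional evolutionary algorithms. Tests on standard databases confirm the effectiveness of the method.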

11.
Multiple sequence alignment (MSA) is one of the most fundamental problems in computational molecular biology. The running time of the best known scheme for finding an optimal alignment, based on dynamic programming, increases exponentially with the number of input sequences. Hence, many heuristics were suggested for the problem. We consider a version of the MSA problem where the goal is to find an optimal alignment in which matches are restricted to positions in predefined matching segments. We present several techniques for making the dynamic programming algorithm more efficient, while still finding an optimal solution under these restrictions. We prove that it suffices to find an optimal alignment of the predefined sequence segments, rather than single letters, thereby reducing the input size and thus improving the running time. We also identify "shortcuts" that expedite the dynamic programming scheme. Empirical study shows that, taken together, these observations lead to an improved running time over the basic dynamic programming algorithm by 4 to 12 orders of magnitude, while still obtaining an optimal solution. Under the additional assumption that matches between segments are transitive, we further improve the running time for finding the optimal solution by restricting the search space of the dynamic programming algorithm.

12.
The major algorithms currently used for aligning biological sequences are those based on the dynamic programming method. A dynamic programming algorithm consists of two major procedures, forward and traceback routines. This paper describes a dynamic programming algorithm for aligning three sequences at a time. Deletions and insertions are penalized according to their numbers and lengths. A forward process is accomplished in O(L^3) computational steps, where L is the average sequence length. On the other hand, a traceback process is done in T steps, where T is the number of elementary configurations involved in the optimal alignment (usually T much less than L). The traceback procedure uses an effective technique for memory management, which is applicable to a wide range of sequence-matching methods.
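The cubic forward pass for three sequences can be illustrated with a simplified sketch. It assumes sum-of-pairs column scoring and a linear gap penalty, whereas the paper penalizes gaps by both number and length; the function name and score values are hypothetical, and the traceback and memory management described in the abstract are omitted.

```python
from itertools import product


def three_way_score(a: str, b: str, c: str, match: int = 1,
                    mismatch: int = -1, gap: int = -1) -> float:
    """Optimal sum-of-pairs score for aligning three sequences (forward pass only)."""

    def pair(x, y):
        if x is None and y is None:
            return 0  # gap against gap is not penalized
        if x is None or y is None:
            return gap
        return match if x == y else mismatch

    la, lb, lc = len(a), len(b), len(c)
    neg = float("-inf")
    d = [[[neg] * (lc + 1) for _ in range(lb + 1)] for _ in range(la + 1)]
    d[0][0][0] = 0
    # The 7 column types: each sequence contributes a residue (1) or a gap (0).
    moves = [m for m in product((0, 1), repeat=3) if any(m)]
    for i in range(la + 1):
        for j in range(lb + 1):
            for k in range(lc + 1):
                for di, dj, dk in moves:
                    pi, pj, pk = i - di, j - dj, k - dk
                    if pi < 0 or pj < 0 or pk < 0:
                        continue
                    x = a[pi] if di else None
                    y = b[pj] if dj else None
                    z = c[pk] if dk else None
                    cand = d[pi][pj][pk] + pair(x, y) + pair(x, z) + pair(y, z)
                    if cand > d[i][j][k]:
                        d[i][j][k] = cand
    return d[la][lb][lc]
```

The three nested loops over sequence positions, each with a constant seven predecessor cells, give the O(L^3) step count quoted in the abstract.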

13.
Protein sequence alignment has become an essential task in modern molecular biology research. A number of alignment techniques have been documented in the literature and their corresponding tools are made available as freeware and commercial software. The choice and use of these tools for sequence alignment through the complete interpretation of alignment results is often considered non-trivial by end-users with limited skill in bioinformatics algorithm development. Here, we discuss the comparison of sequence alignment techniques based on dynamic programming (N-W, S-W) and heuristics (LFASTA, BL2SEQ) for four sets of sequence data towards an educational purpose. The analysis suggests that heuristics-based methods are faster than dynamic programming methods in alignment speed.

14.
A molecular sequence alignment algorithm based on dynamic programming has been extended to allow the computation of all pairs of residues that can be part of optimal and suboptimal sequence alignments. The uncertainties inherent in sequence alignment can be displayed using a new form of dot plot. The method allows the qualitative assessment of whether or not two sequences are related, and can reveal what parts of the alignment are better determined than others. It also permits the computation of representative optimal and suboptimal alignments. The relation between alignment reliability and alignment parameters is discussed. Other applications are to cyclical permutations of sequences and the detection of self-similarity. An application to multiple sequence alignment is noted.

15.
1 Introduction. Sequence alignment based on dynamic programming is the most widely used method for sequence comparison at present. When working on large-scale genome sequence analysis with this kind of method, we face two major difficulties: the large storage and the long computational time. Myers and Miller applied Hirschberg's technique to the sequence alignment problem, with an algorithm that consumes space only proportional to the sum of the sequence lengths. A new program, SIM, utilizing the algorithm, has been used in sequence alignment. However, the computational time by SIM is still too lo…

16.
We have developed simulated annealing algorithms to solve the problem of multiple sequence alignment. The algorithm was shown to give the optimal solution as confirmed by the rigorous dynamic programming algorithm for three-sequence alignment. To overcome long execution times for simulated annealing, we utilized a parallel computer. A sequential algorithm, a simple parallel algorithm and the temperature parallel algorithm were tested on a problem. The results were compared with the result obtained by a conventional tree-based algorithm where alignments were merged by two-way dynamic programming. Every annealing algorithm produced a better energy value than the conventional algorithm. The best energy value, which probably represents the optimal solution, was reached within a reasonable time by both of the parallel annealing algorithms. We consider the temperature parallel algorithm of simulated annealing to be the most suitable for finding the optimal multiple sequence alignment because the algorithm does not require any scheduling for optimization. The algorithm is also useful for refining multiple alignments obtained by other heuristic methods.

17.
A new approach to sequence comparison: normalized sequence alignment   Total citations: 3, self-citations: 0, citations by others: 3
The Smith-Waterman algorithm for local sequence alignment is one of the most important techniques in computational molecular biology. This ingenious dynamic programming approach was designed to reveal the highly conserved fragments by discarding poorly conserved initial and terminal segments. However, the existing notion of local similarity has a serious flaw: it does not discard poorly conserved intermediate segments. The Smith-Waterman algorithm finds the local alignment with maximal score but it is unable to find the local alignment with maximum degree of similarity (e.g. maximal percent of matches). Moreover, there is still no efficient algorithm that answers the following natural question: do two sequences share a (sufficiently long) fragment with more than 70% of similarity? As a result, the local alignment sometimes produces a mosaic of well-conserved fragments artificially connected by poorly-conserved or even unrelated fragments. This may lead to problems in comparison of long genomic sequences and comparative gene prediction as recently pointed out by Zhang et al. (Bioinformatics, 15, 1012-1019, 1999). In this paper we propose a new sequence comparison algorithm (normalized local alignment) that reports the regions with maximum degree of similarity. The algorithm is based on fractional programming and its running time is O(n^2 log n). In practice, normalized local alignment is only 3-5 times slower than the standard Smith-Waterman algorithm.
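For context, the standard Smith-Waterman scoring pass that this abstract takes as its baseline can be sketched as follows. This is the ordinary maximal-score local alignment with a linear gap penalty, not the normalized variant the paper proposes; the function name and scoring values are illustrative assumptions.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (linear gap penalty)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The floor at 0 lets an alignment restart anywhere, which is
            # what discards poorly conserved initial and terminal segments.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

Because only the total score is maximized, two strong fragments can be joined across a weak middle stretch whenever the combined score still increases, which is the mosaic effect the abstract describes.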

18.
Recomputation of the previously evaluated similarity results between biological sequences becomes inevitable when researchers realize errors in their sequenced data or when the researchers have to compare nearly similar sequences, e.g., in a family of proteins. We present an efficient scheme for updating local sequence alignments with an affine gap model. In principle, using the previous matching result between two amino acid sequences, we perform a forward-backward alignment to generate heuristic searching bands which are bounded by a set of suboptimal paths. Given a correctly updated sequence, we initially predict a new score of the alignment path for each contour to select the best candidates among them. Then, we run the Smith-Waterman algorithm in this confined space. Furthermore, our heuristic alignment for an updated sequence shows that it can be further accelerated by using reusable dynamic programming (rDP), our prior work. In this study, we successfully validate the "relative node tolerance bound" (RNTB) in the pruned searching space. Furthermore, we improve the computational performance by quantifying the successful RNTB tolerance probability and switching to rDP on perturbation-resilient columns only. In our searching space derived by a threshold value of 90 percent of the optimal alignment score, we find that 98.3 percent of contours contain correctly updated paths. We also find that our method consumes only 25.36 percent of the runtime cost of the sparse dynamic programming (sDP) method, and only 2.55 percent of that of a normal dynamic programming with the Smith-Waterman algorithm.
