Similar literature
 18 similar articles found; search took 153 ms
1.
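Zhang Lin. 《生物信息学》 (Chinese Journal of Bioinformatics), 2014, 12(3): 179-184
This study seeks a local alignment method for biological sequences that is accurate, efficient, low-cost, and general-purpose. Among local alignment algorithms such as dot-matrix and heuristic methods, the dynamic programming local alignment algorithm is the most accurate; it was implemented on a computer and mapped onto graphics hardware through a stream model to accelerate it. Alignment time and performance in million cell updates per second (MCUPS) were then evaluated by searching a database with example queries. The results show that the accelerated algorithm significantly increases alignment speed while preserving alignment accuracy. Compared with the fastest current heuristic algorithm, the average speedup is 14.5-fold and the maximum speedup reaches 22.9-fold.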
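The two quantities in this abstract, the dynamic programming local alignment score and the MCUPS figure, can be illustrated with a minimal CPU-only sketch. It assumes a Smith-Waterman recurrence with a linear gap penalty and arbitrary scoring values (`match=2`, `mismatch=-1`, `gap=-2`); the function names are hypothetical, and the paper's GPU stream mapping is not reproduced here.

```python
import time


def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (linear gap penalty; scores are assumed values)."""
    prev = [0] * (len(target) + 1)  # previous row of the DP matrix
    best = 0
    for q in query:
        curr = [0]  # column 0 is always 0 in local alignment
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def mcups(query_len, database_len, seconds):
    """Million cell updates per second: DP cells filled divided by time."""
    return query_len * database_len / (seconds * 1e6)


if __name__ == "__main__":
    query = "ACACACTA"
    database = ["AGCACACA", "TTTT", "ACACACTA"]  # toy database, not real data
    start = time.perf_counter()
    scores = [smith_waterman_score(query, seq) for seq in database]
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero timer
    print(scores, mcups(len(query), sum(map(len, database)), elapsed))
```

Each database residue compared against each query residue counts as one cell update, so MCUPS measures throughput independently of sequence lengths.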

2.
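A new nucleic acid sequence alignment algorithm and its application to global sequence alignment   Total citations: 1 (self-citations: 0, citations by others: 1)
The dynamic programming algorithm now widely used in sequence alignment yields the optimal alignment, but its high computational complexity of O(N^2) greatly reduces computational efficiency. A multi-stage dynamic programming decision algorithm was applied to pairwise sequence alignment and implemented in Visual BASIC. The new algorithm was found to reduce the computational complexity to O(N) while still achieving reasonably good accuracy, and it is expected to play an important role in global sequence alignment.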

3.
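Biological sequence alignment occupies an important place in bioinformatics research. Multiple sequence alignment is an NP-complete problem, so exact solutions cannot be computed within practical time and space limits. This paper briefly introduces the basic idea of the multiple sequence alignment algorithm proposed by Feng and Doolittle and improves the algorithm to obtain better alignment accuracy. Experimental results show that the new algorithm is effective against the local-optimum problem encountered in general progressive multiple sequence alignment methods.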

4.
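Application of BLASTALIGN to the retrieval of homologous gene fragments   Total citations: 1 (self-citations: 1, citations by others: 0)
The development of multiple sequence alignment programs is a very active research area in bioinformatics. BlastAlign, a recently introduced multiple sequence alignment program, uses the blastn algorithm to align sequences and replaces hypervariable regions in the alignment with gaps. At present the program is used mainly for aligning multiple gene sequences. Taking the retrieval of 28S rDNA D2 region gene fragments of Chalcidoidea (Hymenoptera) as an example, this paper uses the BlastAlign alignment procedure to provide a relatively simple and effective way to retrieve and filter specific homologous gene sequences from the GenBank database, thereby overcoming the limitations of current retrieval by keyword or search expression.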

5.
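A fast sequence alignment algorithm based on dynamic programming   Total citations: 3 (self-citations: 0, citations by others: 3)
Sequence alignment algorithms are one of the important research directions in bioinformatics, and dynamic programming is the most effective and most fundamental method among them. Because the basic dynamic programming method has high time and space complexity, it is poorly suited to practical biological sequence alignment. After analyzing and introducing several related dynamic programming algorithms, this paper proposes UKK_FA, a fast sequence alignment algorithm based on dynamic programming. Experimental results show that the algorithm effectively reduces time complexity and has practical value.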
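The abstract does not describe UKK_FA itself. As an illustration of how dynamic programming can be made cheaper, the sketch below shows a generic Ukkonen-style cutoff for unit-cost edit distance: only cells within a diagonal band of half-width `t` are filled, and `t` is doubled until the result fits. Treating the name UKK_FA as a hint at this family of methods is an assumption, and the function names are hypothetical.

```python
def banded_edit_distance(a, b, t):
    """Edit distance using only cells with |i - j| <= t; None if it exceeds t."""
    n, m = len(a), len(b)
    if abs(n - m) > t:
        return None
    inf = t + 1  # stands for "more than t" for cells outside the band
    prev = {j: j for j in range(min(m, t) + 1)}  # row 0 of the DP matrix
    for i in range(1, n + 1):
        curr = {}
        for j in range(max(0, i - t), min(m, i + t) + 1):
            if j == 0:
                curr[j] = i  # deleting the first i characters of a
                continue
            substitute = prev.get(j - 1, inf) + (a[i - 1] != b[j - 1])
            delete = prev.get(j, inf) + 1
            insert = curr.get(j - 1, inf) + 1
            curr[j] = min(substitute, delete, insert)
        prev = curr
    distance = prev.get(m, inf)
    return distance if distance <= t else None


def edit_distance(a, b):
    """Exact edit distance by doubling the band until the answer fits inside it."""
    t = max(1, abs(len(a) - len(b)))
    while True:
        distance = banded_edit_distance(a, b, t)
        if distance is not None:
            return distance
        t *= 2


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
```

For two similar sequences the band stays narrow, so the work grows roughly with the band width times the sequence length rather than with the full matrix.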

6.
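Sequence alignment is an important tool in bioinformatics research and is widely used in sequence assembly, protein structure prediction, protein structure-function analysis, phylogenetic analysis, database searching, and primer design. This paper describes in detail several sequence alignment algorithms commonly used in bioinformatics, compares their computational complexity, advantages, and disadvantages, discusses the scope of application of each, and points out directions for future research on sequence alignment.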

7.
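The ant colony genetic algorithm is an improved algorithm obtained by using a genetic algorithm to optimize the parameters of the ant colony algorithm. Applying it to DNA sequence alignment shows that this new sequence alignment algorithm is very effective.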

8.
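RNA sequence-structure alignment based on a quantum evolutionary algorithm   Total citations: 1 (self-citations: 0, citations by others: 1)
Multiple sequence alignment is a classic problem in computational molecular biology and an important basic step in much biological research. RNA differs from proteins and DNA in that its secondary structure is more conserved during evolution than its primary sequence, so RNA sequence alignment must consider not only sequence information but, above all, secondary structure information. This paper proposes an RNA multiple sequence-structure alignment program based on a quantum evolutionary algorithm: RNA sequences are quantum-encoded, a full crossover operator that incorporates structural information is designed, and a fitness function suited to RNA sequence-structure alignment is proposed, which overcomes the slow convergence and premature convergence of traditional evolutionary algorithms. Tests on standard databases confirm the effectiveness of the method.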

9.
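Multiple sequence alignment is a basic and important sequence analysis method in bioinformatics. This paper proposes a new multiple sequence alignment algorithm that combines the progressive alignment method with an iterative strategy, uses a weighting function to correct for the biased distribution of sequences, and builds a guide tree with the neighbor-joining method to determine the order of progressive alignment. Tests on 142 groups of protein sequence alignments from BAliBASE verify the effectiveness of the algorithm. Comparison with the Multalin algorithm shows that it effectively improves alignment accuracy for highly divergent sequences.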

10.
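Sequence alignment is the most commonly used and most classic research technique in bioinformatics. Biological sequence alignment requires hardware with substantial computing power, and the rapidly developing GPGPU is well suited to the task. This paper first introduces the development of GPGPU, then describes GPGPU hardware and its programming environment, then introduces the mathematical library functions needed for scientific computing on GPGPU, and finally reviews recent GPGPU-based biological sequence alignment software and related research in China and abroad, summarizing this work and its promising prospects.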

11.

Motivation

To obtain large-scale sequence alignments in a fast and flexible way is an important step in the analyses of next generation sequencing data. Applications based on the Smith-Waterman (SW) algorithm are often either not fast enough, limited to dedicated tasks or not sufficiently accurate due to statistical issues. Current SW implementations that run on graphics hardware do not report the alignment details necessary for further analysis.

Results

With the Parallel SW Alignment Software (PaSWAS) it is possible (a) to have easy access to the computational power of NVIDIA-based general purpose graphics processing units (GPGPUs) to perform high-speed sequence alignments, and (b) to retrieve relevant information such as score, number of gaps and mismatches. The software reports multiple hits per alignment. The added value of the new SW implementation is demonstrated with two test cases: (1) tag recovery in next generation sequence data and (2) isotype assignment within an immunoglobulin 454 sequence data set. Both cases show the usability and versatility of the new parallel Smith-Waterman implementation.
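The kind of alignment detail mentioned here (score, gaps, mismatches) comes from a traceback through the Smith-Waterman matrix. The sketch below is a plain CPU illustration, not PaSWAS code: the function name, the linear gap penalty, and the scoring values are assumptions, and it reports only the single best hit rather than multiple hits.

```python
def smith_waterman_stats(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment of a and b with its score, gap and mismatch counts."""
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best, end_i, end_j = 0, 0, 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, end_i, end_j = H[i][j], i, j
    # Trace back from the best cell until a zero cell is reached.
    i, j = end_i, end_j
    matches = mismatches = gaps = 0
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            if a[i - 1] == b[j - 1]:
                matches += 1
            else:
                mismatches += 1
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            gaps += 1  # a[i-1] aligned to a gap in b
            i -= 1
        else:
            gaps += 1  # b[j-1] aligned to a gap in a
            j -= 1
    return {"score": best, "matches": matches, "mismatches": mismatches,
            "gaps": gaps, "a_range": (i, end_i), "b_range": (j, end_j)}


if __name__ == "__main__":
    print(smith_waterman_stats("ACGTACGT", "ACGTTCGT"))
```

The ranges are half-open indices into each input, so `a[i:end_i]` is the aligned part of the first sequence. Keeping the full matrix costs quadratic memory, which is why score-only implementations often skip the traceback.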

12.
We have developed simulated annealing algorithms to solve the problem of multiple sequence alignment. The algorithm was shown to give the optimal solution as confirmed by the rigorous dynamic programming algorithm for three-sequence alignment. To overcome long execution times for simulated annealing, we utilized a parallel computer. A sequential algorithm, a simple parallel algorithm and the temperature parallel algorithm were tested on a problem. The results were compared with the result obtained by a conventional tree-based algorithm where alignments were merged by two-way dynamic programming. Every annealing algorithm produced a better energy value than the conventional algorithm. The best energy value, which probably represents the optimal solution, was reached within a reasonable time by both of the parallel annealing algorithms. We consider the temperature parallel algorithm of simulated annealing to be the most suitable for finding the optimal multiple sequence alignment because the algorithm does not require any scheduling for optimization. The algorithm is also useful for refining multiple alignments obtained by other heuristic methods.

13.
MOTIVATION: Multiple sequence alignment is an important tool in computational biology. In order to solve the task of computing multiple alignments in affordable time, the most commonly used multiple alignment methods have to use heuristics. Nevertheless, the computation of optimal multiple alignments is important in its own right, and it provides a means of evaluating heuristic approaches or serves as a subprocedure of heuristic alignment methods. RESULTS: We present an algorithm that uses the divide-and-conquer alignment approach together with recent results on search space reduction to speed up the computation of multiple sequence alignments. The method is adaptive in that, depending on the time one wants to spend on the alignment, a better, up to optimal, alignment can be obtained. To speed up the computation in the optimal alignment step, we apply the A* algorithm, which leads to a procedure provably more efficient than previous exact algorithms. We also describe our implementation of the algorithm and present results showing the effectiveness and limitations of the procedure.

14.
Multiple alignments among genomes are becoming increasingly prevalent. This trend motivates the development of tools for efficient homology search between a query sequence and a database of multiple alignments. In this paper, we present an algorithm that uses the information implicit in a multiple alignment to dynamically build an index that is weighted most heavily towards the promising regions of the multiple alignment. We have implemented Typhon, a local alignment tool that incorporates our indexing algorithm, which our test results show to be more sensitive than algorithms that index only a sequence. This suggests that when applied on a whole-genome scale, Typhon should provide improved homology searches in time comparable to existing algorithms.

15.
A comprehensive comparison of multiple sequence alignment programs.   Total citations: 35 (self-citations: 4, citations by others: 31)
In recent years improvements to existing programs and the introduction of new iterative algorithms have changed the state-of-the-art in protein sequence alignment. This paper presents the first systematic study of the most commonly used alignment programs using BAliBASE benchmark alignments as test cases. Even below the 'twilight zone' at 10-20% residue identity, the best programs were capable of correctly aligning on average 47% of the residues. We show that iterative algorithms often offer improved alignment accuracy though at the expense of computation time. A notable exception was the effect of introducing a single divergent sequence into a set of closely related sequences, causing the iteration to diverge away from the best alignment. Global alignment programs generally performed better than local methods, except in the presence of large N/C-terminal extensions and internal insertions. In these cases, a local algorithm was more successful in identifying the most conserved motifs. This study enables us to propose appropriate alignment strategies, depending on the nature of a particular set of sequences. The employment of more than one program based on different alignment techniques should significantly improve the quality of automatic protein sequence alignment methods. The results also indicate guidelines for improvement of alignment algorithms.

16.
Sequence alignment by cross-correlation.   Total citations: 1 (self-citations: 0, citations by others: 1)
Many recent advances in biology and medicine have resulted from DNA sequence alignment algorithms and technology. Traditional approaches for the matching of DNA sequences are based either on global alignment schemes or heuristic schemes that seek to approximate global alignment algorithms while providing higher computational efficiency. This report describes an approach using the mathematical operation of cross-correlation to compare sequences. It can be implemented using the fast Fourier transform for computational efficiency. The algorithm is summarized and sample applications are given. These include gene sequence alignment in long stretches of genomic DNA, finding sequence similarity in distantly related organisms, demonstrating sequence similarity in the presence of massive (approximately 90%) random point mutations, comparing sequences related by internal rearrangements (tandem repeats) within a gene, and investigating fusion proteins. Application to RNA and protein sequence alignment is also discussed. The method is efficient, sensitive, and robust, being able to find sequence similarities where other alignment algorithms may perform poorly.
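One way to picture cross-correlation of DNA sequences is to count, for every relative shift, how many positions carry the same base. The sketch below assumes an encoding of one 0/1 indicator vector per base (the abstract does not state the paper's encoding) and uses a small pure-Python radix-2 FFT. The names `fft` and `match_counts_by_shift` are hypothetical.

```python
import cmath


def fft(x, invert=False):
    """Recursive radix-2 FFT; len(x) must be a power of two (inverse is unscaled)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], invert)
    odd = fft(x[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def match_counts_by_shift(a, b, alphabet="ACGT"):
    """Map each shift s to the number of j with a[j + s] == b[j]."""
    n, m = len(a), len(b)
    size = 1
    while size < n + m - 1:  # zero-pad so circular shifts do not overlap
        size *= 2
    total = [0.0] * size
    for base in alphabet:
        x = [1.0 if ch == base else 0.0 for ch in a] + [0.0] * (size - n)
        y = [1.0 if ch == base else 0.0 for ch in b] + [0.0] * (size - m)
        fx, fy = fft(x), fft(y)
        # Correlation theorem: corr = IFFT(FFT(x) * conj(FFT(y)))
        corr = fft([fx[k] * fy[k].conjugate() for k in range(size)], invert=True)
        for k in range(size):
            total[k] += corr[k].real / size
    return {s: round(total[s % size]) for s in range(-(m - 1), n)}


if __name__ == "__main__":
    counts = match_counts_by_shift("TTACGTTT", "ACGT")
    print(max(counts, key=counts.get), counts)
```

The shift with the highest count marks the best ungapped placement of `b` along `a`. Because all shifts are scored in one transform, the cost grows as N log N rather than with the product of the sequence lengths.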

17.
Several recent publications illustrated advantages of using sequence profiles in recognizing distant homologies between proteins. At the same time, the practical usefulness of distant homology recognition depends not only on the sensitivity of the algorithm, but also on the quality of the alignment between a prediction target and the template from the database of known proteins. Here, we study this question for several supersensitive protein algorithms that were previously compared in their recognition sensitivity (Rychlewski et al., 2000). A database of protein pairs with similar structures, but low sequence similarity is used to rate the alignments obtained with several different methods, which included sequence-sequence, sequence-profile, and profile-profile alignment methods. We show that incorporation of evolutionary information encoded in sequence profiles into alignment calculation methods significantly increases the alignment accuracy, bringing them closer to the alignments obtained from structure comparison. In general, alignment quality is correlated with recognition and alignment score significance. For every alignment method, alignments with statistically significant scores correlate with both correct structural templates and good quality alignments. At the same time, average alignment lengths differ in various methods, making the comparison between them difficult. For instance, the alignments obtained by FFAS, the profile-profile alignment algorithm developed in our group, are always longer than the alignments obtained with the PSI-BLAST algorithms. To address this problem, we develop methods to truncate or extend alignments to cover a specified percentage of protein lengths. In most cases, the elongation of the alignment by profile-profile methods is reasonable, adding fragments of similar structure. The examples of erroneous alignment are examined and it is shown that they can be identified based on the model quality.

18.
Multiple sequence alignments (MSAs) have become one of the most studied approaches in bioinformatics to perform other outstanding tasks such as structure prediction, biological function analysis or next-generation sequencing. However, current MSA algorithms do not always provide consistent solutions, since alignments become increasingly difficult when dealing with low similarity sequences. As is widely known, these algorithms directly depend on specific features of the sequences, which has a relevant influence on the alignment accuracy. Many MSA tools have been recently designed but it is not possible to know in advance which one is the most suitable for a particular set of sequences. In this work, we analyze some of the most used algorithms presented in the bibliography and their dependences on several features. A novel intelligent algorithm based on least square support vector machine is then developed to predict how accurate each alignment could be, depending on its analyzed features. This algorithm is performed with a dataset of 2180 MSAs. The proposed system first estimates the accuracy of possible alignments. The most promising methodologies are then selected in order to align each set of sequences. Since only one selected algorithm is run, the computational time is not excessively increased.
