Similar articles
19 similar articles found.
1.
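Multiple sequence alignment is a fundamental and important sequence analysis method in bioinformatics. This paper proposes a new multiple sequence alignment algorithm that combines progressive alignment with an iterative strategy, uses a weighting function to correct for the biased distribution of sequences, and builds a guide tree with the neighbor-joining method to determine the order of progressive alignment. Tests on 142 sets of protein sequence alignments from BAliBASE confirm the effectiveness of the algorithm. A comparison with the Multalin algorithm shows that the proposed algorithm effectively improves alignment accuracy for highly divergent sequences.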

2.
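Multiple sequence alignment plays a very important role in elucidating the key biological patterns of a set of related sequences. Since the advent of computers, many researchers have worked on multiple sequence alignment algorithms. The Human Genome Project and the haplotype mapping project have once again made multiple sequence alignment a research hotspot. This paper summarizes the main multiple sequence alignment algorithms in detail, reviews recent research progress in China and abroad, and analyzes and predicts future research directions for the problem.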

3.
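Biological sequence alignment occupies an important position in bioinformatics research. Multiple sequence alignment is an NP-complete problem, and time and space constraints make exact solutions impractical. This paper briefly introduces the basic idea of the multiple sequence alignment algorithm proposed by Feng and Doolittle and improves it to achieve better alignment accuracy. Experimental results show that the new algorithm performs well on the local-optimum problem encountered by typical progressive multiple sequence alignment methods.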

4.
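Cao Yang. Bulletin of Biology (《生物学通报》), 2005, 40(1): 11-12
Multiple sequence alignment can reveal the relationships among a set of DNA or protein sequences and identify conserved regions among them. This article introduces several commonly used multiple sequence alignment programs and tips for using them.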

5.
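Sequence alignment is an important tool in bioinformatics research. It is widely used in sequence assembly, protein structure prediction, analysis of protein structure and function, phylogenetic analysis, database searching, and primer design. This paper describes in detail some sequence alignment algorithms commonly used in bioinformatics, compares their computational complexity, strengths, and weaknesses, discusses their respective ranges of application, and points out directions for future research on sequence alignment.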

6.
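To address the slow computation and outdated software for multiple alignment of gene sequences in bioinformatics, this paper proposes Yarn_clustalW, a parallel computing method for multiple alignment of gene sequences based on the Yarn (Yet Another Resource Negotiator) cloud platform. The mathematical model of the clustalW algorithm and its MapReduce-oriented task partitioning are analyzed. Yarn_clustalW takes both the length and the number of genes into account and uses a task partitioning scheme based on threshold scales. The method was tested using gene data from NCBI GenBank as a case study. Experimental results show that Yarn_clustalW has a shorter running time and a better speedup than the serial clustalW method, which saves researchers considerable time and effort, facilitates the discovery of drug targets, and shortens the development cycle of biological drugs.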

7.
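Life Science Research (《生命科学研究》), 2014, (5): 458-464
The rapid development of high-throughput sequencing technology has brought new opportunities and challenges to bioinformatics. Second-generation sequencing produces large numbers of short reads, so earlier sequence analysis methods are no longer applicable. In recent years, algorithms and software for analyzing high-throughput sequencing data have multiplied, and more than a hundred are now available, which makes choosing suitable software difficult. This paper summarizes the sequencing types, sequence types, and analysis algorithms of second-generation sequencing. It analyzes and compares in detail the sequence types and lengths handled by commonly used analysis software, along with their algorithms, input/output formats, characteristics, and functions, and it offers recommendations. It also analyzes current problems in sequencing technology and sequence analysis and predicts future directions.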

8.
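Sequence alignment is an important task in bioinformatics; it can reveal functional, structural, and evolutionary information in biological sequences. The biological significance of an alignment result depends closely on the chosen scoring of matches, mismatches, insertions and deletions, and gap penalties. This paper introduces a parametric sequence alignment method that treats the optimal alignment as a function of the weights and penalties, so that the effect of parameter choice on the optimal alignment can be obtained systematically. The method is then applied to RNA sequence alignment to analyze how different parameter choices affect the result. Finally, applications of parametric sequence alignment algorithms and future directions are discussed.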

9.
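A new nucleic acid sequence alignment algorithm and its application in global sequence alignment   Cited by: 1 (self-citations: 0; cited by others: 1)
The dynamic programming algorithm widely used in sequence alignment yields an optimal alignment, but its high computational complexity, O(N²), greatly reduces computational efficiency. A multi-stage dynamic programming decision algorithm was applied to pairwise sequence alignment and implemented in Visual Basic. The new algorithm reduces the computational complexity to O(N) while still achieving reasonably good accuracy, and it is expected to play an important role in global sequence alignment.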
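
For orientation, the quadratic-time baseline that this abstract refers to can be sketched as a standard Needleman-Wunsch global alignment score. This is a minimal illustration, not the multi-stage O(N) algorithm the authors propose; the function name and the scoring values (`match=1`, `mismatch=-1`, `gap=-2`) are assumptions chosen for the example.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    Fills an (len(a)+1) x (len(b)+1) table row by row, so the running
    time is O(len(a) * len(b)). Only two rows are kept in memory.
    Scoring values are illustrative assumptions, not from the paper.
    """
    # Row 0: aligning the empty prefix of `a` against prefixes of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against the empty prefix of `b`.
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in `b`
            left = curr[j - 1] + gap  # gap inserted in `a`
            curr[j] = max(diag, up, left)
        prev = curr
    return prev[-1]
```

The two nested loops over the sequence lengths are the source of the O(N²) cost mentioned above; any linear-time method has to avoid filling the full table.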

10.
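The ant colony genetic algorithm is an improved algorithm derived from the ant colony algorithm by using a genetic algorithm to optimize its parameters. Applying it to DNA sequence alignment shows that this new sequence alignment algorithm is very effective.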

11.
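Clustal W: software for protein and nucleic acid sequence analysis   Cited by: 2 (self-citations: 1; cited by others: 2)
Sequence analysis of proteins and nucleic acids plays an important role in modern biology and bioinformatics, and new algorithms and software keep emerging. This article introduces ClustalW, a completely free multiple sequence comparison program that runs on PCs. It can perform multiple sequence comparison of proteins and nucleic acids, analyze similarity relationships among sequences, and draw phylogenetic trees. Thanks to its flexible input and output formats, convenient parameter settings and options, detailed online help, and good portability, ClustalW is widely used in protein and nucleic acid sequence analysis.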

12.
Multiple sequence alignments are fundamental to many sequence analysis methods. Most alignments are computed using the progressive alignment heuristic. These methods are starting to become a bottleneck in some analysis pipelines when faced with data sets of the size of many thousands of sequences. Some methods allow computation of larger data sets while sacrificing quality, and others produce high-quality alignments, but scale badly with the number of sequences. In this paper, we describe a new program called Clustal Omega, which can align virtually any number of protein sequences quickly and that delivers accurate alignments. The accuracy of the package on smaller test cases is similar to that of the high-quality aligners. On larger data sets, Clustal Omega outperforms other packages in terms of execution time and quality. Clustal Omega also has powerful features for adding sequences to and exploiting information in existing alignments, making use of the vast amount of precomputed information in public databases like Pfam.

13.
Protein multiple sequence alignment is an important bioinformatics tool. It has important applications in biological evolution analysis and protein structure prediction. A variety of alignment algorithms in this field have achieved great success. However, each algorithm has its own inherent deficiencies. In this paper, permutation similarity is proposed to evaluate several protein multiple sequence alignment algorithms that are widely used currently. As the permutation similarity method only concerns the relative order of different protein evolutionary distances, without taking into account the slight difference between the evolutionary distances, it can get more robust evaluations. The longest common subsequence method is adopted to define the similarity between different permutations. Using these methods, we assessed Dialign, Tcoffee, ClustalW and Muscle and made comparisons among them.
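
The idea of comparing two orderings through their longest common subsequence (LCS) can be sketched as follows. The abstract does not give the exact definition, so the normalization used here (LCS length divided by permutation length) and both function names are assumptions made for illustration.

```python
def lcs_length(p, q):
    """Length of the longest common subsequence of two sequences (O(n*m) DP)."""
    prev = [0] * (len(q) + 1)
    for x in p:
        curr = [0]
        for j, y in enumerate(q, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def permutation_similarity(p, q):
    """Similarity in [0, 1] between two permutations of the same items.

    Assumed normalization: LCS length divided by the number of items,
    so identical orderings score 1.0.
    """
    if sorted(p) != sorted(q):
        raise ValueError("p and q must be permutations of the same items")
    if not p:
        return 1.0
    return lcs_length(p, q) / len(p)
```

Under this sketch, two rankings of proteins by evolutionary distance that differ only by swapping two neighbours still score close to 1, which matches the abstract's point that only relative order matters, not small differences in the distances themselves.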

14.
The many faces of sequence alignment   Cited by: 9 (self-citations: 0; cited by others: 9)
Starting with the sequencing of the mouse genome in 2002, we have entered a period where the main focus of genomics will be to compare multiple genomes in order to learn about human biology and evolution at the DNA level. Alignment methods are the main computational component of this endeavour. This short review aims to summarise the current status of research in alignments, emphasising large-scale genomic comparisons and suggesting possible directions that will be explored in the near future.

15.
Shatsky M  Nussinov R  Wolfson HJ 《Proteins》2006,62(1):209-217
Routinely used multiple-sequence alignment methods use only sequence information. Consequently, they may produce inaccurate alignments. Multiple-structure alignment methods, on the other hand, optimize structural alignment by ignoring sequence information. Here, we present an optimization method that unifies sequence and structure information. The alignment score is based on standard amino acid substitution probabilities combined with newly computed three-dimensional structure alignment probabilities. The advantage of our alignment scheme is in its ability to produce more accurate multiple alignments. We demonstrate the usefulness of the method in three applications: 1) computing more accurate multiple-sequence alignments, 2) analyzing protein conformational changes, and 3) computation of amino acid structure-sequence conservation with application to protein-protein docking prediction. The method is available at http://bioinfo3d.cs.tau.ac.il/staccato/.

16.
SNUFER is a software tool for automatically locating single nucleotide polymorphisms (SNPs) and generating the tables used to present them. After input of a FASTA file containing the sequences to be analyzed, a multiple sequence alignment is generated using ClustalW run inside SNUFER. The ClustalW output file is then used to generate a table which displays the SNPs detected in the aligned sequences and their degree of similarity. This table can be exported to Microsoft Word, Microsoft Excel or as a single text file, permitting further editing for publication. The software was written using Delphi 7 for programming and FireBird 2.0 for sequence database management. It is freely available for noncommercial use and can be downloaded from http://www.heranza.com.br/bioinformatica2.htm.
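
The step of reading SNPs off an alignment can be illustrated with a short sketch. This is not SNUFER's implementation (which is written in Delphi); the function name is hypothetical, and the rule used here, that a column counts as an SNP when it holds more than one distinct base and gap characters (`-`) are ignored, is an assumption made for the example.

```python
def find_snp_columns(aligned):
    """Return (1-based position, column string) for each variable column.

    `aligned` is a list of equal-length strings taken from a multiple
    sequence alignment. Gap characters '-' are ignored when deciding
    whether a column is variable (an assumption of this sketch).
    """
    if len({len(s) for s in aligned}) > 1:
        raise ValueError("aligned sequences must all have the same length")
    snps = []
    for pos, column in enumerate(zip(*aligned), start=1):
        bases = {c.upper() for c in column if c != "-"}
        if len(bases) > 1:
            snps.append((pos, "".join(column)))
    return snps
```

Each returned tuple corresponds to one row of an SNP table: the alignment position and the base observed in each sequence at that position.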

17.
The traditional approach to bioinformatics analyses relies on independent task-specific services and applications, using different input and output formats, often idiosyncratic, and frequently not designed to inter-operate. In general, such analyses were performed by experts who manually verified the results obtained at each step in the process. Today, the amount of bioinformatics information continuously being produced means that handling the various applications used to study this information presents a major data management and analysis challenge to researchers. It is now impossible to manually analyse all this information and new approaches are needed that are capable of processing the large-scale heterogeneous data in order to extract the pertinent information. We review the recent use of integrated expert systems aimed at providing more efficient knowledge extraction for bioinformatics research. A general methodology for building knowledge-based expert systems is described, focusing on the unstructured information management architecture, UIMA, which provides facilities for both data and process management. A case study involving a multiple alignment expert system prototype called AlexSys is also presented.

18.
A novel method has been developed for acquiring the correct alignment of a query sequence against remotely homologous proteins by extracting structural information from profiles of multiple structure alignment. A systematic search algorithm combined with a group of score functions based on sequence information and structural information has been introduced in this procedure. A limited number of top solutions (15,000) with high scores were selected as candidates for further examination. On a test-set comprising 301 proteins from 75 protein families with sequence identity less than 30%, the proportion of proteins with completely correct alignment as first candidate was improved to 39.8% by our method, whereas the typical performance of existing sequence-based alignment methods was only between 16.1% and 22.7%. Furthermore, multiple candidates for possible alignment were provided in our approach, which dramatically increased the possibility of finding correct alignment, such that completely correct alignments were found amongst the top-ranked 1000 candidates in 88.3% of the proteins. With the assistance of a sequence database, completely correct alignment solutions were achieved amongst the top 1000 candidates in 94.3% of the proteins. From such a limited number of candidates, it would become possible to identify more correct alignment using a more time-consuming but more powerful method with more detailed structural information, such as side-chain packing and energy minimization, etc. The results indicate that the novel alignment strategy could be helpful for extending the application of highly reliable methods for fold identification and homology modeling to a huge number of homologous proteins of low sequence similarity. Details of the methods, together with the results and implications for future development are presented.

19.
We have developed a phylogeny-aware progressive alignment method that recognizes insertions and deletions as distinct evolutionary events and thus avoids systematic errors created by traditional alignment methods. We now extend this method to simultaneously model regional heterogeneity and evolution. This novel method can be flexibly adapted to alignment of nucleotide or amino acid sequences evolving under processes that vary over genomic regions and, being fully probabilistic, provides an estimate of regional heterogeneity of the evolutionary process along the alignment and a measure of local reliability of the solution. Furthermore, the evolutionary modelling of substitution process permits adjusting the sensitivity and specificity of the alignment and, if high specificity is aimed at, leaving sequences unaligned when their divergence is beyond a meaningful detection of homology.
