首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 46 毫秒
1.
在生物信息学研究中,生物序列比对问题占有重要的地位。多序列比对问题是一个NPC问题,由于时间和空间的限制不能够求出精确解。文中简要介绍了Feng和Doolittle提出的多序列比对算法的基本思想,并改进了该算法使之具有更好的比对精度。实验结果表明,新算法对解决一般的progressive多序列比对方法中遇到的局部最优问题有较好的效果。  相似文献   

2.
In this study, we address the meta-task scheduling problem in heterogeneous computing (HC) systems, which is to find a task assignment that minimizes the schedule length of a meta-task composed of several independent tasks with no data dependencies. The fact that the meta-task scheduling problem in HC systems is NP-hard has motivated the development of many heuristic scheduling algorithms. These heuristic algorithms, however, neglect the stochastic nature of task execution times in an attempt to minimize a deterministic objective function, which is the maximum of the expected values of machine loads. Contrary to existing heuristics, we account for this stochastic nature by modeling task execution times as random variables. We, then, formulate a stochastic scheduling problem where the objective is to minimize the expected value of the maximum of machine loads. We prove that this new objective is underestimated by the deterministic objective function and that an optimal task assignment obtained with respect to the deterministic objective function could be inefficient in a real computing platform. In order to solve the stochastic scheduling problem posed, we develop a genetic algorithm based scheduling heuristic. Our extensive simulation studies show that the proposed genetic algorithm can produce better task assignments as compared to existing heuristics. Specifically, we observe a performance improvement on the relative cost heuristic (M.-Y. Wu and W. Shu, A high-performance mapping algorithm for heterogeneous computing systems, in: Int. Parallel and Distributed Processing Symposium, San Francisco, CA, April 2001) by up to 61%.  相似文献   

3.
We present HotKnots, a new heuristic algorithm for the prediction of RNA secondary structures including pseudoknots. Based on the simple idea of iteratively forming stable stems, our algorithm explores many alternative secondary structures, using a free energy minimization algorithm for pseudoknot free secondary structures to identify promising candidate stems. In an empirical evaluation of the algorithm with 43 sequences taken from the Pseudobase database and from the literature on pseudoknotted structures, we found that overall, in terms of the sensitivity and specificity of predictions, HotKnots outperforms the well-known Pseudoknots algorithm of Rivas and Eddy and the NUPACK algorithm of Dirks and Pierce, both based on dynamic programming approaches for limited classes of pseudoknotted structures. It also outperforms the heuristic Iterated Loop Matching algorithm of Ruan and colleagues, and in many cases gives better results than the genetic algorithm from the STAR package of van Batenburg and colleagues and the recent pknotsRG-mfe algorithm of Reeder and Giegerich. The HotKnots algorithm has been implemented in C/C++ and is available from http://www.cs.ubc.ca/labs/beta/Software/HotKnots.  相似文献   

4.
MOTIVATION: In the maximum parsimony (MP) method, the tree requiring the minimum number of changes (discrepancy) to explain the given set of DNA or amino acid sequences is chosen to represent their evolutionary relationships. To find the MP tree, the branch-and-bound algorithm is normally used. For a partial phylogenetic-tree (one that has a subset of the organisms) the traditional algorithm assigns a cost equal to the discrepancy of the partial phylogenetic-tree. We propose a single column discrepancy heuristic which increases this cost by predicting a minimum additional discrepancy needed to attach the sequences yet to be added to the partial phylogenetic-tree. A dynamic Max-mini order of sequence addition is also proposed to quickly terminate branch-and-bound search paths that are guaranteed to lead to suboptimal solutions. RESULTS: We studied the running time of 47 problems generated from 17 data sets. The use of single column discrepancy heuristic speeded up the computation to 2.4-fold for static and 18.2-fold for dynamic search order. The improvement appeared to increase exponentially with the number of sequences. The proposed strategies are also likely to be useful in speeding up the MP tree search using heuristic searches that are based on branch-and-bound-like algorithms.CONTACT: s.kumar@asu.edu  相似文献   

5.
MOTIVATION: Multiple sequence alignment is an important tool in computational biology. In order to solve the task of computing multiple alignments in affordable time, the most commonly used multiple alignment methods have to use heuristics. Nevertheless, the computation of optimal multiple alignments is important in its own right, and it provides a means of evaluating heuristic approaches or serves as a subprocedure of heuristic alignment methods. RESULTS: We present an algorithm that uses the divide-and-conquer alignment approach together with recent results on search space reduction to speed up the computation of multiple sequence alignments. The method is adaptive in that depending on the time one wants to spend on the alignment, a better, up to optimal alignment can be obtained. To speed up the computation in the optimal alignment step, we apply the alpha(*) algorithm which leads to a procedure provably more efficient than previous exact algorithms. We also describe our implementation of the algorithm and present results showing the effectiveness and limitations of the procedure.  相似文献   

6.
A parallel genetic algorithm for optimization is outlined, and its performance on both mathematical and biomechanical optimization problems is compared to a sequential quadratic programming algorithm, a downhill simplex algorithm and a simulated annealing algorithm. When high-dimensional non-smooth or discontinuous problems with numerous local optima are considered, only the simulated annealing and the genetic algorithm, which are both characterized by a weak search heuristic, are successful in finding the optimal region in parameter space. The key advantage of the genetic algorithm is that it can easily be parallelized at negligible overhead.  相似文献   

7.
A genetic algorithm simulating Darwinian evolution is proposed to yield near-optimal solutions to the Traveling Salesman Problem. Noting that Darwinian evolution is itself an optimization process, we propose a heuristic algorithm that incorporates the tenets of natural selection. The time complexity of this algorithm is equivalent to the fastest sorting scheme! Computer simulations indicate rapid convergence is maintained even with increasing problem complexity. This methodology can be adapted to tackle a host of other combinatorial problems.  相似文献   

8.
As sequencing techniques become increasingly efficient, the average length of a sequence is bound to grow. Traditional sequence-comparison algorithms can either compare DNA or protein, but not a mixture, which is actually a common situation. Most obtained DNA sequences contain coding regions, and it is more reliable to compare the coding regions as protein than just as DNA.A heuristic algorithm is presented that can compare DNA with both coding and noncoding regions, but that also can compare multiple reading frames and determine which exons are homologous.A program, GenAl (Genomic Alignment), was developed that implements the algorithm. Its use is demonstrated on two retroviruses. Correspondence to: J. Hein  相似文献   

9.
MOTIVATION: The problem of reconstructing full sibling groups from DNA marker data remains a significant challenge for computational biology. A recently published heuristic algorithm based on Mendelian exclusion rules and the Simpson index was successfully applied to the full sibship reconstruction (FSR) problem. However, the so-called SIMPSON algorithm has an unknown complexity measure, questioning its applicability range. RESULTS: We present a modified version of the SIMPSON (MS) algorithm that behaves as O(n(3)) and achieves the same or better accuracy when compared with the original algorithm. Performance of the MS algorithm was tested on a variety of simulated diploid population samples to verify its complexity measure and the significant improvement in efficiency (e.g. 100 times faster than SIMPSON in some cases). It has been shown that, in theory, the SIMPSON algorithm runs in non-polynomial time, significantly limiting its usefulness. It has been also verified via simulation experiments that SIMPSON could run in O(n(a)), where a > 3. AVAILABILITY: Computer code written in Java is available upon request from the first author. CONTACT: Dmitry.Konovalov@jcu.edu.au.  相似文献   

10.
基于启发式A^*算法的超声图像颈动脉内膜提取   总被引:1,自引:0,他引:1  
从超声图像准确提取颈动脉内膜,为基于颈动脉超声图像判断动脉粥样硬化服务。方法提出一种基于启发式A*算法从超声图像中提取颈动脉内膜边缘的方法。先使用图像分割法区分血管腔和血管壁,再采用结合图像灰度值特点的A*算法准确地提取颈动脉内膜边缘。结果通过对临床采集的32幅颈动脉超声图像的分析研究,表明本方法自动提取的结果与医生手工描绘的结果基本吻合。结论本方法有望应用于超声图像颈动脉内膜的自动提取。  相似文献   

11.
HCPM is a tool for clustering protein structures from comparative modeling, ab initio structure prediction, etc. A hierarchical clustering algorithm is designed and tested, and a heuristic is provided for an optimal cluster selection. The method has been successfully tested during the CASP6 experiment.  相似文献   

12.
张林 《生物信息学》2014,12(3):179-184
为探索准确、高效、低成本、通用性并存的生物序列局部比对方法。将点阵图算法、启发式算法等各种序列局部比对算法中准确性最高的动态规划局部比对算法在计算机中实现,并通过流式模型将其映射到图形硬件上以实现算法加速,再通过实例比对搜索数据库完成比对时间和每秒百万次格点更新(MCUPS)性能值评测。结果表明,该加速算法在保证比对准确性的同时,能显著提升比对速度。与目前最快的启发式算法相比,比对平均加速为14.5倍,最高加速可达22.9倍。  相似文献   

13.
Fault Tolerant Contact Map Reconstruction (FT-COMAR) is a heuristic algorithm for the reconstruction of the protein three-dimensional structure from (possibly) incomplete (i.e. containing unknown entries) and noisy contact maps. FT-COMAR runs within minutes, allowing its application to a large-scale number of predictions. AVAILABILITY: http://bioinformatics.cs.unibo.it/FT-COMAR  相似文献   

14.
15.
L Wernisch  M Hunting  S J Wodak 《Proteins》1999,35(3):338-352
A novel automatic procedure for identifying domains from protein atomic coordinates is presented. The procedure, termed STRUDL (STRUctural Domain Limits), does not take into account information on secondary structures and handles any number of domains made up of contiguous or non-contiguous chain segments. The core algorithm uses the Kernighan-Lin graph heuristic to partition the protein into residue sets which display minimum interactions between them. These interactions are deduced from the weighted Voronoi diagram. The generated partitions are accepted or rejected on the basis of optimized criteria, representing basic expected physical properties of structural domains. The graph heuristic approach is shown to be very effective, it approximates closely the exact solution provided by a branch and bound algorithm for a number of test proteins. In addition, the overall performance of STRUDL is assessed on a set of 787 representative proteins from the Protein Data Bank by comparison to domain definitions in the CATH protein classification. The domains assigned by STRUDL agree with the CATH assignments in at least 81% of the tested proteins. This result is comparable to that obtained previously using PUU (Holm and Sander, Proteins 1994;9:256-268), the only other available algorithm designed to identify domains with any number of non-contiguous chain segments. A detailed discussion of the structures for which our assignments differ from those in CATH brings to light some clear inconsistencies between the concept of structural domains based on minimizing inter-domain interactions and that of delimiting structural motifs that represent acceptable folding topologies or architectures. Considering both concepts as complementary and combining them in a layered approach might be the way forward.  相似文献   

16.
A complete set of software tools to aid the physical mapping of a genome has been developed and successfully applied to the genomic mapping of the fission yeast Schizosaccharomyces pombe. Two approaches were used for ordering single-copy hybridisation probes: one was based on the simulated annealing algorithm to order all probes, and another on inferring the minimum-spanning subset of the probes using a heuristic filtering procedure. Both algorithms produced almost identical maps, with minor differences in the order of repetitive probes and those having identical hybridisation patterns. A separate algorithm fitted the clones to the established probe order. Approaches for handling experimental noise and repetitive elements are discussed. In addition to these programs and the database management software, tools for visualizing and editing the data are described. The issues of combining the information from different libraries are addressed. Also, ways of handling multiple-copy probes and non-hybridisation data are discussed.  相似文献   

17.
Clustering gene expression patterns.   总被引:23,自引:0,他引:23  
Recent advances in biotechnology allow researchers to measure expression levels for thousands of genes simultaneously, across different conditions and over time. Analysis of data produced by such experiments offers potential insight into gene function and regulatory mechanisms. A key step in the analysis of gene expression data is the detection of groups of genes that manifest similar expression patterns. The corresponding algorithmic problem is to cluster multicondition gene expression patterns. In this paper we describe a novel clustering algorithm that was developed for analysis of gene expression data. We define an appropriate stochastic error model on the input, and prove that under the conditions of the model, the algorithm recovers the cluster structure with high probability. The running time of the algorithm on an n-gene dataset is O[n2[log(n)]c]. We also present a practical heuristic based on the same algorithmic ideas. The heuristic was implemented and its performance is demonstrated on simulated data and on real gene expression data, with very promising results.  相似文献   

18.
本文提出一种新的基于重连接方法的无标度网络构建算法.根据重连接方法新节点的调控节点会被重选,重连接概率取决于幂率分布模型参数gamma.用本文算法构建的网络通过微分方程模型来模拟基因表达谱数据,所用的优化算法为GA与PSO.候选节点的选择可以根据已有节点的连接数决定.实验的网络可以用log-log图,模拟的基因表达谱也用微分方程模型来验证效果.每个连接的正确性将会通过实验验证,完整的程序可以通过我们的官方网站获得:http://ccst.jlu.edu.cn/CSBG/ourown/.  相似文献   

19.
MOTIVATION: There is currently much interest in reverse-engineering regulatory relationships between genes from microarray expression data. We propose a new algorithmic method for inferring such interactions between genes using data from gene knockout experiments. The algorithm we use is the Sparse Bayesian regression algorithm of Tipping and Faul. This method is highly suited to this problem as it does not require the data to be discretized, overcomes the need for an explicit topology search and, most importantly, requires no heuristic thresholding of the discovered connections. RESULTS: Using simulated expression data, we are able to show that this algorithm outperforms a recently published correlation-based approach. Crucially, it does this without the need to set any ad hoc threshold on possible connections.  相似文献   

20.
MOTIVATION: Deciphering the location of gene duplications and multiple gene duplication episodes on the Tree of Life is fundamental to understanding the way gene families and genomes evolve. The multiple gene duplication problem provides a framework for placing gene duplication events onto nodes of a given species tree, and detecting episodes of multiple gene duplication. One version of the multiple gene duplication problem was defined by Guigó et al. in 1996. Several heuristic solutions have since been proposed for this problem, but no exact algorithms were known. RESULTS: In this article we solve this longstanding open problem by providing the first exact and efficient solution. We also demonstrate the improvement offered by our algorithm over the best heuristic approaches, by applying it to several simulated as well as empirical datasets.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号