首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
We investigated the usefulness of a parallel genetic algorithm for phylogenetic inference under the maximum-likelihood (ML) optimality criterion. Parallelization was accomplished by assigning each "individual" in the genetic algorithm "population" to a separate processor so that the number of processors used was equal to the size of the evolving population (plus one additional processor for the control of operations). The genetic algorithm incorporated branch-length and topological mutation, recombination, selection on the ML score, and (in some cases) migration and recombination among subpopulations. We tested this parallel genetic algorithm with large (228 taxa) data sets of both empirically observed DNA sequence data (for angiosperms) as well as simulated DNA sequence data. For both observed and simulated data, search-time improvement was nearly linear with respect to the number of processors, so the parallelization strategy appears to be highly effective at improving computation time for large phylogenetic problems using the genetic algorithm. We also explored various ways of optimizing and tuning the parameters of the genetic algorithm. Under the conditions of our analyses, we did not find the best-known solution using the genetic algorithm approach before terminating each run. We discuss some possible limitations of the current implementation of this genetic algorithm as well as of avenues for its future improvement.  相似文献   

2.
MOTIVATION: Analysis of large biological data sets using a variety of parallel processor computer architectures is a common task in bioinformatics. The efficiency of the analysis can be significantly improved by properly handling redundancy present in these data combined with taking advantage of the unique features of these compute architectures. RESULTS: We describe a generalized approach to this analysis, but present specific results using the program CEPAR, an efficient implementation of the Combinatorial Extension algorithm in a massively parallel (PAR) mode for finding pairwise protein structure similarities and aligning protein structures from the Protein Data Bank. CEPAR design and implementation are described and results provided for the efficiency of the algorithm when run on a large number of processors. AVAILABILITY: Source code is available by contacting one of the authors.  相似文献   

3.
Many computational intensive bioinformatics software, such as multiple sequence alignment, population structure analysis, etc., written in C/C++ are not multicore-aware. A multicore processor is an emerging CPU technology that combines two or more independent processors into a single package. The Single Instruction Multiple Data-stream (SIMD) paradigm is heavily utilized in this class of processors. Nevertheless, most popular compilers including Microsoft Visual C/C++ 6.0, x86 gnu C-compiler gcc do not automatically create SIMD code which can fully utilize the advancement of these processors. To harness the power of the new multicore architecture certain compiler techniques must be considered. This paper presents a generic compiling strategy to assist the compiler in improving the performance of bioinformatics applications written in C/C++. The proposed framework contains 2 main steps: multithreading and vectorizing strategies. After following the strategies, the application can achieve higher speedup by taking the advantage of multicore architecture technology. Due to the extremely fast interconnection networking among multiple cores, it is suggested that the proposed optimization could be more appropriate than making use of parallelization on a small cluster computer which has larger network latency and lower bandwidth.  相似文献   

4.
RNA folding using the massively parallel genetic algorithm (GA) has been enhanced by the addition of a Boltzmann filter. The filter uses the Boltzmann probability distribution in conjunction with Metropolis' relaxation algorithm. The combination of these two concepts within the GA's massively parallel computational environment helps guide the genetic algorithm to more accurately reflect RNA folding pathways and thus final solution structures. Helical regions (base-paired stems) now form in the structures based upon the stochastic properties of the thermodynamic parameters that have been determined from experiments. Thus, structural changes occur based upon the relative energetic impact that the change causes rather than just geometric conflicts alone. As a result, when comparing the predictions to phylogenetically determined structures, over multiple runs, fewer false-positive stems (predicted incorrectly) and more true-positive stems (predicted correctly) are generated, and the total number of predicted stems representing a solution is diminished. In addition, the significance (rate of occurrence) of the true-positive stems is increased. Thus, the predicted results more accurately reflect phylogenetically determined structures.  相似文献   

5.
The massively parallel genetic algorithm (GA) for RNA structure prediction uses the concepts of mutation, recombination, and survival of the fittest to evolve a population of thousands of possible RNA structures toward a solution structure. As described below, the properties of the algorithm are ideally suited to use in the prediction of possible folding pathways and functional intermediates of RNA molecules given their sequences. Utilizing Stem Trace, an interactive visualization tool for RNA structure comparison, analysis of not only the solution ensembles developed by the algorithm, but also the stages of development of each of these solutions, can give strong insight into these folding pathways. The GA allows the incorporation of information from biological experiments, making it possible to test the influence of particular interactions between structural elements on the dynamics of the folding pathway. These methods are used to reveal the folding pathways of the potato spindle tuber viroid (PSTVd) and the host killing mechanism of Escherichia coli plasmid R1, both of which are successfully explored through the combination of the GA and Stem Trace. We also present novel intermediate folds of each molecule, which appear to be phylogenetically supported, as determined by use of the methods described below.  相似文献   

6.
Diffusion limited aggregation (DLA) has proved very successful in modelling systems which display fractal characteristics, like viscous fingering. However, by nature, such simulations are very processor intensive, requiring large amounts of processor time even for relatively small models. We have performed simulations of viscous fingering on the NCUBE parallel computer which has hypercube architecture. We find that, as long as the number of processors used is much less than both the total number of walkers released and the overall dimensions of the model, the fractal dimensions obtained using serial and parallel algorithms give similar results whilst achieving a considerable speed-up in the parallel implementation. An average fractal dimension of 1.71 was obtained along with a speed-up of 106 (in the best case) and 83% efficiency using 128 processors.  相似文献   

7.
This paper investigates scheduling strategies for divisible jobs/loads originating from multiple sites in hierarchical networks with heterogeneous processors and communication channels. In contrast, most previous work in the divisible load scheduling theory (DLT) literature mainly addressed scheduling problems with loads originating from a single processor. This is one of the first works that address scheduling multiple loads from multiple sites in the DLT paradigm. In addition, scheduling multi-site jobs is common in Grids and other general distributed systems for resource sharing and coordination. An efficient static scheduling algorithm PPDD (Processor-set Partitioning and Data Distribution Algorithm) is proposed to near-optimally distribute multiple loads among all processors so that the overall processing time of all jobs is minimized. The PPDD algorithm is applied to two cases: when processors are equipped with front-ends and when they are not equipped with front-ends. The application of the algorithm to homogeneous systems is also studied. Further, several important properties exhibited by the PPDD algorithm are proven through lemmas. To implement the PPDD algorithm, we propose a communication strategy. In addition, we compare the performance of the PPDD algorithm with a Round-robin Scheduling Algorithm (RSA), which is most commonly used. Extensive case studies through numerical analysis have been conducted to verify the theoretical findings.  相似文献   

8.
Abstract

An algorithm is described which allows Nonequilibrium Molecular Dynamics (NEMD) simulations of a fluid undergoing planar Couette flow (shear flow) to be carried out on a distributed memory parallel processor using a (spatial) domain decomposition technique. Unlike previous algorithms, this algorithm uses a co-moving, or Lagrangian, simulation box. Also, the shape of the simulation box changes throughout the course of the simulation. The algorithm, which can be used for two or three dimensional systems, has been tested on a Fujitsu AP1000 Parallel computer with 128 processors.  相似文献   

9.
Ribonucleic Acid (RNA) structures can be viewed as a special kind of strings where characters in a string can bond with each other. The question of aligning two RNA structures has been studied for a while, and there are several successful algorithms that are based upon different models. In this paper, by adopting the model introduced in Wang and Zhang,(19) we propose two algorithms to attack the question of aligning multiple RNA structures. Our methods are to reduce the multiple RNA structure alignment problem to the problem of aligning two RNA structure alignments. Meanwhile, we will show that the framework of sequence center star alignment algorithm can be applied to the problem of multiple RNA structure alignment, and if the triangle inequality is met in the scoring matrix, the approximation ratio of the algorithm remains to be 2-2(over)n, where n is the total number of structures.  相似文献   

10.
We describe an algorithm (IRSA) for identification of common regulatory signals in samples of unaligned DNA sequences. The algorithm was tested on randomly generated sequences of fixed length with implanted signal of length 15 with 4 mutations, and on natural upstream regions of bacterial genes regulated by PurR, ArgR and CRP. Then it was applied to upstream regions of orthologous genes from Escherichia coli and related genomes. Some new palindromic binding and direct repeats signals were identified. Finally we present a parallel version suitable for computers supporting the MPI protocol. This implementation is not strictly bounded by the number of available processors. The computation speed linearly depends on the number of processors.  相似文献   

11.
Simulating protein folding thermodynamics starting purely from a protein sequence is a grand challenge of computational biology. Here, we present an algorithm to calculate a canonical distribution from molecular dynamics simulation of protein folding. This algorithm is based on the replica exchange method where the kinetic trapping problem is overcome by exchanging noninteracting replicas simulated at different temperatures. Our algorithm uses multiplexed-replicas with a number of independent molecular dynamics runs at each temperature. Exchanges of configurations between these multiplexed-replicas are also tried, rendering the algorithm applicable to large-scale distributed computing (i.e., highly heterogeneous parallel computers with processors having different computational power). We demonstrate the enhanced sampling of this algorithm by simulating the folding thermodynamics of a 23 amino acid miniprotein. We show that better convergence is achieved compared to constant temperature molecular dynamics simulation, with an efficient scaling to large number of computer processors. Indeed, this enhanced sampling results in (to our knowledge) the first example of a replica exchange algorithm that samples a folded structure starting from a completely unfolded state.  相似文献   

12.
Parallel computers offer a more cost-effective route to high performance computing than traditional single processor machines. Software for such machines is still in its infancy and they are often much more difficult to program than sequential machines. In addition many of the algorithms which are successful with sequential and vector processors are no longer appropriate. Both the force calculation and integration steps of molecular dynamics are parallel in nature and for that reason we have developed a parallel algorithm based on the link cell technique. This method is particularly efficient when the range of intermolecular potential is much smaller than the dimensions of the simulation box. The details of the algorithm are presented for systems of atoms in two and three dimensions using a number of decompositions into sub-units. The algorithm has been tested on an Intel iPSC/2 and a Cray X-MP/416 and the results are presented for simulations of up to 2 · 106 atoms.  相似文献   

13.
We study the problem of scheduling a divisible load in a three-dimensional mesh of processors. The objective is to find partition of a load into shares and distribution of load shares among processors which minimize load processing time subject to communication delays involved in sending load from one processor to another. We propose a new scheduling algorithm which distributes load in a sequence of stages across the network, each stage brings load to a set of processors located at the same distance from the load source. A key feature of our solution is that sets of processors receive load in the order of decreasing processing capacities. We call this scheduling strategy Largest Layer First. A theorem about the processing time attained by the algorithm is stated. Performance of the algorithm is compared to earlier results.  相似文献   

14.
In this study we apply a genetic algorithm to a set of RNA sequences to find common RNA secondary structures. Our method is a three-step procedure. At the first stage of the procedure for each sequence, a genetic algorithm is used to optimize the structures in a population to a certain degree of stability. In this step, the free energy of a structure is the fitness criterion for the algorithm. Next, for each structure, we define a measure of structural conservation with respect to those in other sequences. We use this measure in a genetic algorithm to improve the structural similarity among sequences for the structures in the population of a sequence. Finally, we select those structures satisfying certain conditions of structural stability and similarity as predicted common structures for a set of RNA sequences. We have obtained satisfactory results from a set of tRNA, 5S rRNA, rev response elements (RRE) of HIV-1 and RRE of HIV-2/SIV, respectively.  相似文献   

15.
SUMMARY: Multiple sequence alignment is a frequently used technique for analyzing sequence relationships. Compilation of large alignments is computationally expensive, but processing time can be considerably reduced when the computational load is distributed over many processors. Parallel processing functionality in the form of single-instruction multiple-data (SIMD) technology was implemented into the multiple alignment program Praline by using 'message passing interface' (MPI) routines. Over the alignments tested here, the parallelized program performed up to ten times faster on 25 processors compared to the single processor version. AVAILABILITY: Example program code for parallelizing pairwise alignment loops is available from http://mathbio.nimr.mrc.ac.uk/~jkleinj/tools/mpicode. The 'message passing interface' package (MPICH) is available from http:/www.unix.mcs.anl.gov/mpi/mpich. CONTACT: jhering@nimr.mrc.ac.uk SUPPLEMENTARY INFORMATION: Praline is accessible at http://mathbio.nimr.mrc.ac.uk/praline.  相似文献   

16.
In this paper, we consider the problem of scheduling divisible loads on arbitrary graphs with the objective to minimize the total processing time of the entire load submitted for processing. We consider an arbitrary graph network comprising heterogeneous processors interconnected via heterogeneous links in an arbitrary fashion. The divisible load is assumed to originate at any processor in the network. We transform the problem into a multi-level unbalanced tree network and schedule the divisible load. We design systematic procedures to identify and eliminate any redundant processor–link pairs (those pairs whose consideration in scheduling will penalize the performance) and derive an optimal tree structure to obtain an optimal processing time, for a fixed sequence of load distribution. Since the algorithm thrives to determine an equivalent number of processors (resources) that can be used for processing the entire load, we refer to this approach as resource-aware optimal load distribution (RAOLD) algorithm. We extend our study by applying the optimal sequencing theorem proposed for single-level tree networks in the literature for multi-level tree for obtaining an optimal solution. We evaluate the performance for a wide range of arbitrary graphs with varying connectivity probabilities and processor densities. We also study the effect of network scalability and connectivity. We demonstrate the time performance when the point of load origination differs in the network and highlight certain key features that may be useful for algorithm and/or network system designers. We evaluate the time performance with rigorous simulation experiments under different system parameters for the ease of a complete understanding.  相似文献   

17.
多序列比对是生物信息学中重要的基础研究内容,对各种RNA序列分析方法而言,这也是非常重要的一步。不像DNA和蛋白质,许多功能RNA分子的序列保守性要远差于其结构的保守性,因此,对RNA的分析研究要求其多序列比对不仅要考虑序列信息,而且要充分考虑到其结构信息。本文提出了一种考虑了结构信息的同源RNA多序列比对算法,它先利用热力学方法计算出每条序列的配对概率矩阵,得到结构信息,由此构造各条序列的结构信息矢量,结合传统序列比对方法,提出优化目标函数,采用动态规划算法和渐进比对得到最后的多序列比对。试验证实该方法的有效性。  相似文献   

18.
Linear regression analysis is considered the least computationally demanding method for mapping quantitative trait loci (QTL). However, simultaneous search for multiple QTL, the use of permutations to obtain empirical significance thresholds, and larger experimental studies significantly increase the computational demand. This report describes an easily implemented parallel algorithm, which significantly reduces the computing time in both QTL mapping and permutation testing. In the example provided, the analysis time was decreased to less than 15% of a single processor system by the use of 18 processors. We indicate how the efficiency of the analysis could be improved by distributing the computations more evenly to the processors and how other ways of distributing the data facilitate the use of more processors. The use of parallel computing in QTL mapping makes it possible to routinely use permutations to obtain empirical significance thresholds for multiple traits and multiple QTL models. It could also be of use to improve the computational efficiency of the more computationally demanding QTL analysis methods.  相似文献   

19.
We present a novel algorithm for the efficient generation of high-quality space-filling molecular graphics that is particularly appropriate for the creation of the large number of images needed in the animation of molecular dynamics. Each atom of the molecule is represented by a sphere of an appropriate radius, and the image of the sphere is constructed pixel-by-pixel using a generalization of the lighting model proposed by Porter (Comp. Graphics 1978, 12, 282). The edges of the spheres are antialiased, and intersections between spheres are handled through a simple blending algorithm that provides very smooth edges. We have implemented this algorithm on a multiprocessor computer using a procedure that dynamically repartitions the effort among the processors based on the CPU time used by each processor to create the previous image. This dynamic reallocation among processors automatically maximizes efficiency in the face of both the changing nature of the image from frame to frame and the shifting demands of the other programs running simultaneously on the same processors. We present data showing the efficiency of this multiprocessing algorithm as the number of processors is increased. The combination of the graphics and multiprocessor algorithms allows the fast generation of many high-quality images.  相似文献   

20.
MPI collective communication operations to distribute or gather data are used for many parallel applications from scientific computing, but they may lead to scalability problems since their execution times increase with the number of participating processors. In this article, we show how the execution time of collective communication operations can be improved significantly by an internal restructuring based on orthogonal processor structures with two or more levels. The execution time of operations like MPI_Bcast() or MPI_Allgather() can be reduced by 40% and 70% on a dual Xeon cluster and a Beowulf cluster with single-processor nodes. But also on a Cray T3E a significant performance improvement can be obtained by a careful selection of the processor structure. The use of these optimized communication operations can reduce the execution time of data parallel implementations of complex application programs significantly without requiring any other change of the computation and communication structure. We present runtime functions for the modeling of two-phase realizations and verify that these runtime functions can predict the execution time both for communication operations in isolation and in the context of application programs.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号