首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 171 毫秒
1.
Markov clustering (MCL) is becoming a key algorithm within bioinformatics for determining clusters in networks. However,with increasing vast amount of data on biological networks, performance and scalability issues are becoming a critical limiting factor in applications. Meanwhile, GPU computing, which uses CUDA tool for implementing a massively parallel computing environment in the GPU card, is becoming a very powerful, efficient, and low-cost option to achieve substantial performance gains over CPU approaches. The use of on-chip memory on the GPU is efficiently lowering the latency time, thus, circumventing a major issue in other parallel computing environments, such as MPI. We introduce a very fast Markov clustering algorithm using CUDA (CUDA-MCL) to perform parallel sparse matrix-matrix computations and parallel sparse Markov matrix normalizations, which are at the heart of MCL. We utilized ELLPACK-R sparse format to allow the effective and fine-grain massively parallel processing to cope with the sparse nature of interaction networks data sets in bioinformatics applications. As the results show, CUDA-MCL is significantly faster than the original MCL running on CPU. Thus, large-scale parallel computation on off-the-shelf desktop-machines, that were previously only possible on supercomputing architectures, can significantly change the way bioinformaticians and biologists deal with their data.  相似文献   

2.
Mass spectrometry-based proteomics is a maturing discipline of biologic research that is experiencing substantial growth. Instrumentation has steadily improved over time with the advent of faster and more sensitive instruments collecting ever larger data files. Consequently, the computational process of matching a peptide fragmentation pattern to its sequence, traditionally accomplished by sequence database searching and more recently also by spectral library searching, has become a bottleneck in many mass spectrometry experiments. In both of these methods, the main rate-limiting step is the comparison of an acquired spectrum with all potential matches from a spectral library or sequence database. This is a highly parallelizable process because the core computational element can be represented as a simple but arithmetically intense multiplication of two vectors. In this paper, we present a proof of concept project taking advantage of the massively parallel computing available on graphics processing units (GPUs) to distribute and accelerate the process of spectral assignment using spectral library searching. This program, which we have named FastPaSS (for Fast Parallelized Spectral Searching), is implemented in CUDA (Compute Unified Device Architecture) from NVIDIA, which allows direct access to the processors in an NVIDIA GPU. Our efforts demonstrate the feasibility of GPU computing for spectral assignment, through implementation of the validated spectral searching algorithm SpectraST in the CUDA environment.  相似文献   

3.
We investigated the effects of soluble epoxide hydrolase (sEH) inhibition on epoxyeicosatrienoic acid (EET) metabolism in intact human blood vessels, including the human saphenous vein (HSV), coronary artery (HCA), and aorta (HA). When HSV segments were perfused with 2 micromol/l 14,15-[3H]EET for 4 h, >60% of radioactivity in the perfusion medium was converted to 14,15-dihydroxyeicosatrienoic acid (DHET). Similar results were obtained with endothelium-denuded vessels. 14,15-DHET was released from both the luminal and adventitial surfaces of the HSV. When HSVs were incubated with 14,15-[3H]EET under static (no flow) conditions, formation of 14,15-DHET was detected within 15 min and was inhibited by the selective sEH inhibitors N,N'-dicyclohexyl urea and N-cyclohexyl-N'-dodecanoic acid urea (CUDA). Similarly, CUDA inhibited the conversion of 11,12-[3H]EET to 11,12-DHET by the HSV. sEH inhibition enhanced the uptake of 14,15-[3H]EET and facilitated the formation of 10,11-epoxy-16:2, a beta-oxidation product. The HCA and HA converted 14,15-[3H]EET to DHET, and this also was inhibited by CUDA. These findings in intact human blood vessels indicate that conversion to DHET is the predominant pathway for 11,12- and 14,15-EET metabolism and that sEH inhibition can modulate EET metabolism in vascular tissue.  相似文献   

4.
Hybrid functional Petri nets are a wide-spread tool for representing and simulating biological models. Due to their potential of providing virtual drug testing environments, biological simulations have a growing impact on pharmaceutical research. Continuous research advancements in biology and medicine lead to exponentially increasing simulation times, thus raising the demand for performance accelerations by efficient and inexpensive parallel computation solutions. Recent developments in the field of general-purpose computation on graphics processing units (GPGPU) enabled the scientific community to port a variety of compute intensive algorithms onto the graphics processing unit (GPU). This work presents the first scheme for mapping biological hybrid functional Petri net models, which can handle both discrete and continuous entities, onto compute unified device architecture (CUDA) enabled GPUs. GPU accelerated simulations are observed to run up to 18 times faster than sequential implementations. Simulating the cell boundary formation by Delta-Notch signaling on a CUDA enabled GPU results in a speedup of approximately 7x for a model containing 1,600 cells.  相似文献   

5.
The opportunities to apply the CUDA technology for faster computations in structural biology and bioinformatics are reviewed and analyzed. Using HEX 6.1 software, we performed a comparative analysis of the efficiency and the increase in quality after CUDA application. The work was conducted on the example of rigid docking of low-molecular-weight compounds of different classes on the surface of the Arabidopsis thaliana FtsZ. Several potential binding sites of benzimidazoles to the plant FtsZ were identified.  相似文献   

6.
We discuss an implementation of molecular dynamics (MD) simulations on a graphic processing unit (GPU) in the NVIDIA CUDA language. We tested our code on a modern GPU, the NVIDIA GeForce 8800 GTX. Results for two MD algorithms suitable for short-ranged and long-ranged interactions, and a congruential shift random number generator are presented. The performance of the GPU's is compared to their main processor counterpart. We achieve speedups of up to 40, 80 and 150 fold, respectively. With the latest generation of GPU's one can run standard MD simulations at 107 flops/$.  相似文献   

7.
8.
While there has been an increase in the number of biomolecular computational studies employing graphics processing units (GPU), results describing their use with the molecular dynamics package AMBER with the CUDA implementation are scarce. No information is available comparing MD methodologies pmemd.cuda, pmemd.mpi or sander.mpi, available in AMBER, for generalised Born (GB) simulations or with solvated systems. As part of our current studies with antifreeze proteins (AFP), and for the previous reasons, we present details of our experience comparing performance of MD simulations at varied temperatures between multi-CPU runs using sander.mpi, pmemd.mpi and pmemd.cuda with the AFP from the fish ocean pout (1KDF). We found extremely small differences in total energies between multi-CPU and GPU CUDA implementations of AMBER12 in 1ns production simulations of the solvated system using the TIP3P water model. Additionally, GPU computations achieved typical one order of magnitude speedups when using mixed precision but were similar to CPU speeds when computing with double precision. However, we found that GB calculations were highly sensitive to the choice of initial GB parametrisation regardless of the type of methodology, with substantial differences in total energies.  相似文献   

9.
Smoothed Particle Hydrodynamics (SPH) is a numerical method commonly used in Computational Fluid Dynamics (CFD) to simulate complex free-surface flows. Simulations with this mesh-free particle method far exceed the capacity of a single processor. In this paper, as part of a dual-functioning code for either central processing units (CPUs) or Graphics Processor Units (GPUs), a parallelisation using GPUs is presented. The GPU parallelisation technique uses the Compute Unified Device Architecture (CUDA) of nVidia devices. Simulations with more than one million particles on a single GPU card exhibit speedups of up to two orders of magnitude over using a single-core CPU. It is demonstrated that the code achieves different speedups with different CUDA-enabled GPUs. The numerical behaviour of the SPH code is validated with a standard benchmark test case of dam break flow impacting on an obstacle where good agreement with the experimental results is observed. Both the achieved speed-ups and the quantitative agreement with experiments suggest that CUDA-based GPU programming can be used in SPH methods with efficiency and reliability.  相似文献   

10.
The results of theoretical studies of the structural and dynamic features of peptides and small proteins have been presented that were carried out by quantum chemical and molecular dynamics methods in high-performance graphic stations, “table supercomputers,” using distributed calculations by the CUDA technology.  相似文献   

11.
This article provides review and analysis of opportunities for application of the CUDA technology for acceleration of computations in structural biology and bioinformatics. On the example of work with the Hex 6.1 program, comparative analysis of increase in the speed and quality of results of hard-docking of a number of low-molecular compounds on the surface of the FtsZ protein from Arabidopsis thaliana was performed. Several potential benzimidazole--plant FtsZ protein binding sites were identified.  相似文献   

12.
The key to achieving high performance on a GPU-enhanced cluster is efficient exploitation of each GPU’s powerful computing capability. Moreover, rationally balancing the workload between CPUs and GPUs can release additional computing power, which arises from the CPUs. In this paper, we extend our earlier work on using a hybrid CPU-GPU cluster for real-world sedimentary basin simulation, by further improving the involved CUDA implementations. A thorough analysis of the achieved new performance is also carried out. By using 1024 GPUs and 12288 CPU cores together, our best CPU-GPU hybrid implementation is able to achieve a double-precision performance of 72.8 TFlops, in connection with simulations on a huge 131072×131072 mesh.  相似文献   

13.
With the performance of central processing units (CPUs) having effectively reached a limit, parallel processing offers an alternative for applications with high computational demands. Modern graphics processing units (GPUs) are massively parallel processors that can execute simultaneously thousands of light-weight processes. In this study, we propose and implement a parallel GPU-based design of a popular method that is used for the analysis of brain magnetic resonance imaging (MRI). More specifically, we are concerned with a model-based approach for extracting tissue structural information from diffusion-weighted (DW) MRI data. DW-MRI offers, through tractography approaches, the only way to study brain structural connectivity, non-invasively and in-vivo. We parallelise the Bayesian inference framework for the ball & stick model, as it is implemented in the tractography toolbox of the popular FSL software package (University of Oxford). For our implementation, we utilise the Compute Unified Device Architecture (CUDA) programming model. We show that the parameter estimation, performed through Markov Chain Monte Carlo (MCMC), is accelerated by at least two orders of magnitude, when comparing a single GPU with the respective sequential single-core CPU version. We also illustrate similar speed-up factors (up to 120x) when comparing a multi-GPU with a multi-CPU implementation.  相似文献   

14.
Recent developments in modern computational accelerators like Graphics Processing Units (GPUs) and coprocessors provide great opportunities for making scientific applications run faster than ever before. However, efficient parallelization of scientific code using new programming tools like CUDA requires a high level of expertise that is not available to many scientists. This, plus the fact that parallelized code is usually not portable to different architectures, creates major challenges for exploiting the full capabilities of modern computational accelerators. In this work, we sought to overcome these challenges by studying how to achieve both automated parallelization using OpenACC and enhanced portability using OpenCL. We applied our parallelization schemes using GPUs as well as Intel Many Integrated Core (MIC) coprocessor to reduce the run time of wave propagation simulations. We used a well-established 2D cardiac action potential model as a specific case-study. To the best of our knowledge, we are the first to study auto-parallelization of 2D cardiac wave propagation simulations using OpenACC. Our results identify several approaches that provide substantial speedups. The OpenACC-generated GPU code achieved more than speedup above the sequential implementation and required the addition of only a few OpenACC pragmas to the code. An OpenCL implementation provided speedups on GPUs of at least faster than the sequential implementation and faster than a parallelized OpenMP implementation. An implementation of OpenMP on Intel MIC coprocessor provided speedups of with only a few code changes to the sequential implementation. We highlight that OpenACC provides an automatic, efficient, and portable approach to achieve parallelization of 2D cardiac wave simulations on GPUs. Our approach of using OpenACC, OpenCL, and OpenMP to parallelize this particular model on modern computational accelerators should be applicable to other computational models of wave propagation in multi-dimensional media.  相似文献   

15.
In the image handling and image processing areas, most operations can be executed in a pixel-by-pixel or cluster-by-cluster manner. These parallel and simultaneous executions have many benefits, and many researchers showed remarkable improvements. In this paper, we started from a specific and practical image handling and feature extraction sequences. We focused on the detailed design and robust implementation on the modern massively parallel architecture of CUDA. We present the enhanced features of our implementation and their design details. Our final result shows 13 times faster execution speed, in comparison with its previous CPU-based implementation. These methods can be applied to the variety of image manipulation processes.  相似文献   

16.
Scanning protein sequence database is an often repeated task in computational biology and bioinformatics. However, scanning large protein databases, such as GenBank, with popular tools such as BLASTP requires long runtimes on sequential architectures. Due to the continuing rapid growth of sequence databases, there is a high demand to accelerate this task. In this paper, we demonstrate how GPUs, powered by the Compute Unified Device Architecture (CUDA), can be used as an efficient computational platform to accelerate the BLASTP algorithm. In order to exploit the GPU’s capabilities for accelerating BLASTP, we have used a compressed deterministic finite state automaton for hit detection as well as a hybrid parallelization scheme. Our implementation achieves speedups up to 10.0 on an NVIDIA GeForce GTX 295 GPU compared to the sequential NCBI BLASTP 2.2.22. CUDA-BLASTP source code which is available at https://sites.google.com/site/liuweiguohome/software.  相似文献   

17.
18.
Tensor contractions are generalized multidimensional matrix multiplication operations that widely occur in quantum chemistry. Efficient execution of tensor contractions on Graphics Processing Units (GPUs) requires several challenges to be addressed, including index permutation and small dimension-sizes reducing thread block utilization. Moreover, to apply the same optimizations to various expressions, we need a code generation tool. In this paper, we present our approach to automatically generate CUDA code to execute tensor contractions on GPUs, including management of data movement between CPU and GPU. To evaluate our tool, GPU-enabled code is generated for the most expensive contractions in CCSD(T), a key coupled cluster method, and incorporated into NWChem, a popular computational chemistry suite. For this method, we demonstrate speedup over a factor of 8.4 using one GPU as compared to one CPU core and over 2.6 when utilizing the entire system using hybrid CPU+GPU solution with 2 GPUs and 5 cores (instead of 7 cores per node). We further investigate tensor contraction code on a new series of GPUs, the Fermi GPUs, and provide several effective optimization algorithms. For the same computation of CCSD(T), on a cluster with Fermi GPUs, we achieve a speedup of 3.4 over a cluster with T10 GPUs. With a single Fermi GPU on each node, we achieve a speedup of 43 over the sequential CPU version.  相似文献   

19.
Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. An example client program is available as public domain software.  相似文献   

20.
The first aim of simulation in virtual environment is to help biologists to have a better understanding of the simulated system. The cost of such simulation is significantly reduced compared to that of in vivo simulation. However, the inherent complexity of biological system makes it hard to simulate these systems on non-parallel architectures: models might be made of sub-models and take several scales into account; the number of simulated entities may be quite large. Today, graphics cards are used for general purpose computing which has been made easier thanks to frameworks like CUDA or OpenCL. Parallelization of models may however not be easy: parallel computer programing skills are often required; several hardware architectures may be used to execute models. In this paper, we present the software architecture we built in order to implement various models able to simulate multi-cellular system. This architecture is modular and it implements data structures adapted for graphics processing units architectures. It allows efficient simulation of biological mechanisms.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号