Similar literature
20 similar documents found (search time: 15 ms)
1.
MOTIVATION: Bioinformatics requires Grid technologies and protocols to build high performance applications without focusing on the low level detail of how the individual Grid components operate. RESULTS: The Discovery Net system is a middleware that allows service developers to integrate tools based on existing and emerging Grid standards such as web services. Once integrated, these tools can be composed into reusable workflows that can later be deployed as new services for others to use. Using the Discovery Net system and a range of different bioinformatics tools, we built a Grid based application for Genome Annotation. This includes workflows for automatic nucleotide annotation, annotation of predicted proteins, and text analysis based on metabolic profiles.

2.
MOTIVATION: The (my)Grid project aims to exploit Grid technology, with an emphasis on the Information Grid, and provide middleware layers that make it appropriate for the needs of bioinformatics. (my)Grid is building high level services for data and application integration such as resource discovery, workflow enactment and distributed query processing. Additional services are provided to support the scientific method and best practice found at the bench but often neglected at the workstation, notably provenance management, change notification and personalisation. RESULTS: We give an overview of these services and their metadata. In particular, we cover the semantically rich metadata, expressed using ontologies, that is necessary to discover, select and compose services into dynamic workflows.

3.
A key problem in executing performance critical applications on distributed computing environments (e.g. the Grid) is the selection of resources. Research related to “automatic resource selection” aims to allocate resources on behalf of users to optimize the execution performance. However, most current approaches are based on the static principle (i.e. resource selection is performed prior to execution) and need detailed application-specific information. In this paper, we introduce a novel on-line automatic resource selection approach. This approach is based on simple control theory: the application continuously reports the Execution Satisfaction Degree (ESD) to the middleware Application Agent (AA), which relies on the reported ESD values to learn the execution behavior and tune the computing environment by adding/replacing/deleting resources during the execution in order to satisfy users’ performance requirements. We introduce two different policies applied to this approach to enable the AA to learn and tune the computing environment: the Utility Classification policy and the Desired Processing Power Estimation (DPPE) policy. Each policy is validated with an iterative application and a non-iterative application to demonstrate that both policies are effective in supporting most kinds of applications.
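The feedback loop described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not reproduce either of their policies: the function names, the threshold values, and the rule of adding or removing one resource per ESD report are assumptions made purely for illustration.

```python
def tune_resources(resource_count, esd, low=0.8, high=1.2, min_resources=1):
    """Return the resource count after reacting to one reported ESD value.

    Assumption: esd is a ratio where values below `low` mean the application
    is unsatisfied and values above `high` mean it is over-provisioned.
    The thresholds are hypothetical, not taken from the paper.
    """
    if esd < low:
        # Unsatisfied: add one resource.
        return resource_count + 1
    if esd > high and resource_count > min_resources:
        # Over-provisioned: release one resource, but keep at least the minimum.
        return resource_count - 1
    # Within the acceptable band: leave the environment unchanged.
    return resource_count


def run_agent(initial_resources, esd_reports):
    """Apply tune_resources to each ESD report and return the count history."""
    history = [initial_resources]
    for esd in esd_reports:
        history.append(tune_resources(history[-1], esd))
    return history
```

Calling `run_agent(2, [0.5, 0.6, 1.0, 1.5])` grows the pool while the reported satisfaction is low and shrinks it once the application reports more capacity than it needs.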

4.
With the proliferation of quad-core and multi-core microprocessors in mainstream platforms such as desktops and workstations, a large number of unused CPU cycles can be utilized for running virtual machines (VMs) as dynamic nodes in distributed environments. Grid services, and their service-oriented business broker now termed cloud computing, could deploy image based virtualization platforms enabling agent based resource management and dynamic fault management. In this paper we present an efficient way of utilizing heterogeneous virtual machines on idle desktops as an environment for consumption of high performance grid services. Spurious and exponential increases in the size of the datasets are constant concerns in medical and pharmaceutical industries due to the constant discovery and publication of large sequence databases. Traditional algorithms are not designed to handle large data sizes under sudden and dynamic changes in the execution environment as previously discussed. This research was undertaken to compare our previous results with those obtained by running the same test dataset on a virtual Grid platform built from virtual machines (virtualization). The implemented architecture, A3pviGrid, utilizes game theoretic optimization and agent based team formation (coalition) algorithms to improve upon scalability with respect to team formation. Due to the dynamic nature of distributed systems (as discussed in our previous work) all interactions were made local within a team transparently. This paper is a proof of concept of an experimental mini-Grid test-bed compared to running the platform on local virtual machines on a local test cluster. This was done to give every agent its own execution platform, enabling anonymity and better control of the dynamic environmental parameters. We also analyze the performance and scalability of BLAST in a multiple virtual node setup and present our findings. This paper is an extension of our previous research on improving the BLAST application framework using dynamic Grids on virtualization platforms such as VirtualBox.

5.
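With the rapid growth in the volume of molecular biology data, bioinformatics faces the enormous challenge of large-scale, high-throughput, compute-intensive processing. To make effective use of computing resources and shorten the execution time of high-throughput bioinformatics programs, we implemented a grid system that supports high-throughput biological data computation, the Biological Data Computing Grid (BDCGrid), based on the Globus Toolkit grid middleware. The BDCGrid computing grid model can effectively integrate the computing resources of small and medium-sized bioinformatics laboratories and greatly shorten the execution time of high-throughput bioinformatics programs, offering researchers a new way to handle large-scale, high-throughput bioinformatics computing tasks with existing computing resources.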

6.
Cactus Tools for Grid Applications   Total citations: 3, self-citations: 0, citations by others: 3
Cactus is an open source problem solving environment designed for scientists and engineers. Its modular structure facilitates parallel computation across different architectures and collaborative code development between different groups. The Cactus Code originated in the academic research community, where it has been developed and used over many years by a large international collaboration of physicists and computational scientists. We discuss here how the intensive computing requirements of physics applications now using the Cactus Code encourage the use of distributed and metacomputing, and detail how its design makes it an ideal application test-bed for Grid computing. We describe the development of tools, and the experiments which have already been performed in a Grid environment with Cactus, including distributed simulations, remote monitoring and steering, and data handling and visualization. Finally, we discuss how Grid portals, such as those already developed for Cactus, will open the door to global computing resources for scientific users.

7.
Modelling and simulation of complex cellular transactions involve development of platforms that understand diverse mathematical representations and are capable of handling large backend computations. Grid Cellware, an integrated modelling and simulation tool, has been developed to precisely address these niche requirements of the modelling community. Grid Cellware implements various pathway simulation algorithms along with an adaptive Swarm algorithm for parameter estimation. For enhanced computational productivity, Grid Cellware uses grid technology with Globus as the middleware.

8.
To deal with the environment's heterogeneity, information providers in the domain of pervasive computing usually offer access to their data by publishing Web services. Therefore, to support applications that need to combine data from a diverse range of sources, pervasive computing requires a middleware to query multiple Web services. Existing work has investigated the generation of optimal query plans. In this paper, however, we propose a query execution model, called PQModel, to optimize the process of query execution over Web services. In other words, we attempt to improve query efficiency by optimizing the execution of query plans.

9.
For several applications and algorithms used in applied bioinformatics, a bottleneck in terms of computational time may arise when they are scaled up to facilitate analyses of large datasets and databases. Re-codification, algorithm modification or sacrifices in sensitivity and accuracy may be necessary to accommodate the limited computational capacity of single workstations. Grid computing offers an alternative model for solving massive computational problems by parallel execution of existing algorithms and software implementations. We present the implementation of a Grid-aware model for solving computationally intensive bioinformatic analyses, exemplified by a blastp sliding window algorithm for whole proteome sequence similarity analysis, and evaluate the performance in comparison with a local cluster and a single workstation. Our strategy involves temporary installations of the BLAST executable and databases on remote nodes at submission, accommodating dynamic Grid environments as it avoids the need for predefined runtime environments (preinstalled software and databases at specific Grid nodes). Importantly, the implementation is generic: the BLAST executable can be replaced by other software tools to facilitate analyses suitable for parallelisation. This model should be of general interest in applied bioinformatics. Scripts and procedures are freely available from the authors.
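As a rough illustration of the sliding window idea mentioned in this abstract, the Python sketch below cuts a protein sequence into overlapping fragments that could each be submitted as an independent query. The abstract does not give the window size, the step, or how fragments are dispatched, so the function name, its parameters, and the handling of the final window are assumptions; no BLAST call is made here.

```python
def sliding_windows(sequence, window, step):
    """Yield (start, fragment) pairs that cover the sequence with windows.

    Hypothetical helper: `window` is the fragment length and `step` is the
    offset between consecutive window starts (step < window gives overlap).
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    # Start position of the last full-length window (0 for short sequences).
    last_start = max(len(sequence) - window, 0)
    for start in range(0, last_start + 1, step):
        yield start, sequence[start:start + window]
    # If stepping did not land exactly on the last window, emit it as well
    # so that the tail of the sequence is always covered.
    if last_start % step != 0:
        yield last_start, sequence[last_start:last_start + window]
```

Because each fragment is independent of the others, the resulting list can be divided among worker nodes without any coordination between them, which is what makes this kind of analysis suitable for parallel execution.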

10.
File and Object Replication in Data Grids   Total citations: 23, self-citations: 0, citations by others: 23
Data replication is a key issue in a Data Grid and can be managed in different ways and at different levels of granularity: for example, at the file level or object level. In the High Energy Physics community, Data Grids are being developed to support the distributed analysis of experimental data. We have produced a prototype data replication tool, the Grid Data Mirroring Package (GDMP), that is in production use in one physics experiment, with middleware provided by the Globus Toolkit used for authentication, data movement, and other purposes. We present here a new, enhanced GDMP architecture and prototype implementation that uses Globus Data Grid tools for efficient file replication. We also explain how this architecture can address object replication issues in an object-oriented database management system. File transfer over wide-area networks requires specific performance tuning in order to gain optimal data transfer rates. We present performance results obtained with GridFTP, an enhanced version of FTP, and discuss tuning parameters.

11.
Diagnostic surgical pathology, or tissue-based diagnosis, remains the most reliable and specific diagnostic medical procedure. The development of whole slide scanners permits the creation of virtual slides and work on so-called virtual microscopes. In addition to interactive work on virtual slides, approaches have been reported that introduce automated virtual microscopy, which is composed of several tools focusing on quite different tasks. These include evaluation of image quality and image standardization, analysis of potentially useful thresholds for object detection and identification (segmentation), dynamic segmentation procedures, adjustable magnification to optimize feature extraction, and texture analysis including image transformation and evaluation of elementary primitives. Grid technology seems to possess all the features needed to efficiently target and control the specific tasks of image information and detection in order to obtain a detailed and accurate diagnosis. Grid technology is based upon so-called nodes that are linked together and share certain communication rules using open standards. Their number and functionality can vary according to the needs of a specific user at a given point in time. When implementing automated virtual microscopy with Grid technology, all five Grid functions have to be taken into account, namely 1) computation services, 2) data services, 3) application services, 4) information services, and 5) knowledge services. Although all mandatory tools of automated virtual microscopy can be implemented in a closed or standardized open system, Grid technology offers a new dimension to acquire, detect, classify, and distribute medical image information, and to assure quality in tissue-based diagnosis.

12.
ABSTRACT: BACKGROUND: Laboratories engaged in computational biology or bioinformatics frequently need to run lengthy, multistep, and user-driven computational jobs. Each job can tie up a computer for a few minutes to several days, and many laboratories lack the expertise or resources to build and maintain a dedicated computer cluster. RESULTS: JobCenter is a client-server application and framework for job management and distributed job execution. The client and server components are both written in Java and are cross-platform and relatively easy to install. All communication with the server is client-driven, which allows worker nodes to run anywhere (even behind external firewalls) and provides inherent load balancing. Adding a worker node to the worker pool is as simple as dropping the JobCenter client files onto any computer and performing basic configuration, providing tremendous ease-of-use, flexibility, and limitless horizontal scalability. Each worker installation may be independently configured, including the types of jobs it is able to run. Executed jobs may be written in any language and may include multiple execution steps. CONCLUSIONS: JobCenter is a versatile and scalable distributed job management system that allows laboratories to very efficiently distribute all computational work among all available resources. JobCenter is freely available at http://code.google.com/p/jobcenter/.

13.
Grid Portals, based on standard web technologies, are emerging as important and useful user interfaces to computational and data Grids. Grid Portals enable Virtual Organizations, comprised of distributed researchers, to collaborate and access resources more efficiently and seamlessly. The Astrophysics Simulation Collaboratory (ASC) Grid Portal provides a framework to enable researchers in the field of numerical relativity to study astrophysical phenomena by making use of the Cactus computational toolkit. We examine user requirements and describe the design and implementation of the ASC Grid Portal.

14.

Background  

Microarray techniques are one of the main methods used to investigate thousands of gene expression profiles in order to shed light on the complex biological processes responsible for serious diseases, with a great scientific impact and a wide application area. Several standalone applications have been developed to analyze microarray data. Two of the best-known free analysis software packages are the R-based Bioconductor and dChip. The part of the dChip software concerning the calculation and analysis of gene expression has been modified to permit its execution on both cluster environments (supercomputers) and Grid infrastructures (distributed computing).

15.
The LTER Grid Pilot Study was conducted by the National Center for Supercomputing Applications, the University of New Mexico, and Michigan State University to design and build a prototype grid for the ecological community. The featured grid application, the Biophony Grid Portal, manages acoustic data from field sensors and allows researchers to conduct real-time digital signal processing analysis on high-performance systems via a web-based portal. Important characteristics addressed during the study include the management, access, and analysis of a large set of field collected acoustic observations from microphone sensors, single sign-on, and data provenance. During the development phase of this project, new features were added to standard grid middleware software and have already been successfully leveraged by other, unrelated grid projects. This paper provides an overview of the Biophony Grid Portal application and requirements, discusses considerations regarding grid architecture and design, details the technical implementation, and summarizes key experiences and lessons learned that are generally applicable to all developers and administrators in a grid environment.

16.
Distributed systems based on clusters of workstations are more and more difficult to manage due to the increasing number of processors involved and the complexity of the associated applications. Such systems need efficient and flexible monitoring mechanisms to fulfill the requirements of administration services. In this paper, we present PHOENIX, a distributed platform supporting both application and operating system monitoring with a variable granularity. The granularity is defined using logical expressions to specify complex monitoring conditions. These conditions can be dynamically modified during the application execution. Observation techniques are based on automatic probe insertion combined with a system agent, in order to minimize the PHOENIX execution time overhead. The platform's extensibility offers a suitable environment for designing distributed value added services (performance monitoring, load balancing, accounting, cluster management, etc.).

17.
Cheng Feng, Huang Yifeng, Tanpure Bhavana, Sawalani Pawan, Cheng Long, Liu Cong. 《Cluster computing》 2022, 25(1): 619-631

As the services provided by cloud vendors offer better performance, auto-scaling, load-balancing, and optimized performance along with low infrastructure maintenance, more and more companies are migrating their services to the cloud. Since the cloud workload is dynamic and complex, scheduling the jobs submitted by users in an effective way is proving to be a challenging task. Although many advanced job scheduling approaches have been proposed in recent years, almost all of them are designed to handle batch jobs rather than real-time workloads, in which user requests may be submitted at any time and in any quantity. In this work, we propose a Deep Reinforcement Learning (DRL) based job scheduler that dispatches jobs in real time to tackle this problem. Specifically, we focus on scheduling user requests in such a way as to provide quality of service (QoS) to the end-user along with a significant reduction in the cost of executing jobs on the virtual instances. We have implemented our method with a Deep Q-learning Network (DQN) model, and our experimental results demonstrate that our approach can significantly outperform commonly used real-time scheduling algorithms.


18.
Fault tolerance in parallel systems has traditionally been achieved through a combination of redundancy and checkpointing methods. This notion has also been extended to message-passing systems with user-transparent process checkpointing and message logging. Furthermore, studies of multiple types of rollback and recovery have been reported in the literature, ranging from communication-induced checkpointing to pessimistic and synchronous solutions. However, many of these solutions incur high overhead because of their inability to utilize application level information. This paper describes the design and implementation of MPI/FT, a high-performance MPI-1.2 implementation enhanced with low-overhead functionality to detect and recover from process failures. The strategy behind MPI/FT is that fault tolerance in message-passing middleware can be optimized based on an application's execution model derived from its communication topology and parallel programming semantics. MPI/FT exploits the specific characteristics of two parallel application execution models in order to optimize performance. MPI/FT also introduces the self-checking thread that monitors the functioning of the middleware itself. User aware checkpointing and user-assisted recovery are compatible with MPI/FT and complement the techniques used here. This paper offers a classification of MPI applications for fault tolerant MPI purposes, and the MPI/FT implementation discussed here provides different middleware versions specifically tailored to each of the two models studied in detail. The interplay of various parameters affecting the cost of fault tolerance is investigated. Experimental results demonstrate that the approach used to design and implement MPI/FT results in a low-overhead MPI-based fault tolerant communication middleware implementation.

19.
MOTIVATION: Grid computing is used to solve large-scale bioinformatics problems involving gigabyte-scale databases by distributing the computation across multiple platforms. Until now, in developing bioinformatics grid applications, it has been extremely tedious to design and implement the component algorithms and parallelization techniques for different classes of problems, and to access remotely located sequence database files of varying formats across the grid. In this study, we propose a grid programming toolkit, GLAD (Grid Life sciences Applications Developer), which facilitates the development and deployment of bioinformatics applications on a grid. RESULTS: GLAD has been developed using ALiCE (Adaptive scaLable Internet-based Computing Engine), a Java-based grid middleware that exploits task-based parallelism. Two bioinformatics benchmark applications, distributed sequence comparison and distributed progressive multiple sequence alignment, have been developed using GLAD.

20.
Development of NPACI Grid Application Portals and Portal Web Services   Total citations: 2, self-citations: 0, citations by others: 2
Grid portals and services are emerging as convenient mechanisms for providing the scientific community with familiar and simplified interfaces to the Grid. Our experiences in implementing computational grid portals, and the services needed to support them, have led to the creation of GridPort: a unique, integrated, layered software system for building portals and hosting portal services that access Grid services. The usefulness of this system has been successfully demonstrated with the implementation of several application portals. This system has several unique features: the software is portable and runs on most webservers; written in Perl/CGI, it is easy to support and modify; a single API provides access to a host of Grid services; it is flexible and adaptable; it supports single login between multiple portals; and portals built with it may run across multiple sites and organizations. In this paper we summarize our experiences in building this system, including philosophy and design choices, and we describe the software we are building that supports portal development and portal services. Finally, we discuss our experiences in developing the GridPort Client Toolkit in support of remote Web client portals and Grid Web services.
