Similar Articles
 Found 20 similar articles (search time: 609 ms)
1.
The LTER Grid Pilot Study was conducted by the National Center for Supercomputing Applications, the University of New Mexico, and Michigan State University to design and build a prototype grid for the ecological community. The featured grid application, the Biophony Grid Portal, manages acoustic data from field sensors and allows researchers to conduct real-time digital signal processing analysis on high-performance systems via a web-based portal. Important characteristics addressed during the study include the management, access, and analysis of a large set of field-collected acoustic observations from microphone sensors, single sign-on, and data provenance. During the development phase of this project, new features were added to standard grid middleware software and have already been successfully leveraged by other, unrelated grid projects. This paper provides an overview of the Biophony Grid Portal application and its requirements, discusses considerations regarding grid architecture and design, details the technical implementation, and summarizes key experiences and lessons learned that are generally applicable to developers and administrators in any grid environment.

2.
MOTIVATION: The complexity of cancer is prompting researchers to find new ways to synthesize information from diverse data sources and to carry out coordinated research efforts that span multiple institutions. There is a need for standard applications, common data models, and software infrastructure to enable more efficient access to and sharing of distributed computational resources in cancer research. To address this need, the National Cancer Institute (NCI) has initiated a national-scale effort, called the cancer Biomedical Informatics Grid (caBIG™), to develop a federation of interoperable research information systems. RESULTS: At the heart of the caBIG approach to federated interoperability is a Grid middleware infrastructure called caGrid. In this paper we describe the caGrid framework and its current implementation, caGrid version 0.5. caGrid is a model-driven, service-oriented architecture that synthesizes and extends a number of technologies to provide a standardized framework for the advertising, discovery, and invocation of data and analytical resources. We expect caGrid to greatly facilitate the launch and ongoing management of coordinated cancer research studies involving multiple institutions, to provide the ability to manage and securely share information and analytic resources, and to spur a new generation of research applications that empower researchers to take a more integrative, trans-domain approach to data mining and analysis. AVAILABILITY: The caGrid version 0.5 release can be downloaded from https://cabig.nci.nih.gov/workspaces/Architecture/caGrid/. The operational test bed Grid can be accessed through the client included in the release, or through the caGrid-browser web application at http://cagrid-browser.nci.nih.gov.

3.
One of the challenges of computation-centric research is to make the research reproducible in a form that others can repeat and re-use with minimal effort. In addition to the data and tools necessary to re-run analyses, execution environments play a crucial role because of dependencies on the operating system and software versions used. However, some of the challenges of reproducible science can be addressed by using appropriate computational tools and cloud computing to provide an execution environment. Here, we demonstrate the use of a Kepler scientific workflow for reproducible science that is sharable, reusable, and re-executable. These workflows reduce barriers to sharing and will save researchers time when undertaking similar research in the future. To provide infrastructure that enables reproducible science, we have developed the cloud-based Collaborative Environment for Ecosystem Science Research and Analysis (CoESRA) to build, execute, and share sophisticated computation-centric research. CoESRA provides users with a storage and computational platform that is accessible from a web browser in the form of a virtual desktop. Any registered user can access the virtual desktop to build, execute, and share Kepler workflows. This approach enables computational scientists to share complete workflows in a pre-configured environment so that others can reproduce the computational research with minimal effort. As a case study, we developed and shared a complete IUCN Red List of Ecosystems Assessment workflow that reproduces the assessments undertaken by Burns et al. (2015) on Mountain Ash forests in the Central Highlands of Victoria, Australia. This workflow provides an opportunity for other researchers and stakeholders to run the assessment with minimal supervision, and it enables researchers to re-evaluate the assessment when additional data become available. The assessment can be run in a CoESRA virtual desktop by opening the workflow in the Kepler user interface and pressing a "start" button. The workflow is pre-configured with all the open-access datasets and writes results to a pre-configured folder.

4.
Electron tomography is the leading technique for elucidating the structure of complex biological specimens. Because of the resolution required, huge reconstructions are needed. Grid computing has the potential to meet the significant computational demands involved. However, a number of key issues, such as instability and difficult user-grid interaction, currently preclude full exploitation of its potential. EGEETomo is a user-friendly application that facilitates interaction with the grid for the non-specialized user and automates job submission and supervision. In addition, EGEETomo is supplied with an automated fault-recovery mechanism, which is key to making all this work transparent to the user. EGEETomo significantly accelerates tomographic reconstruction by exploiting the computational resources of the EGEE grid with minimal user intervention. AVAILABILITY: http://www.ace.ual.es/~jrbcast/EGEETomo.tar.gz

5.

Background

Quantitative proteomics holds great promise for identifying proteins that are differentially abundant between populations representing different physiological or disease states. A range of computational tools is now available for both isotopically labeled and label-free liquid chromatography mass spectrometry (LC-MS) based quantitative proteomics. However, these tools are generally not comparable to one another in terms of functionality, user interface, or information input/output, and they do not readily facilitate appropriate statistical data analysis. These limitations, along with the array of choices, present a daunting prospect for biologists and other researchers not trained in bioinformatics who wish to use LC-MS-based quantitative proteomics.

Results

We have developed Corra, a computational framework and set of tools for discovery-based LC-MS proteomics. Corra extends and adapts existing algorithms used for LC-MS-based proteomics, as well as statistical algorithms originally developed for microarray data analysis, making them appropriate for LC-MS data. Corra also adapts software engineering technologies (e.g. Google Web Toolkit, distributed processing) so that computationally intense data processing and statistical analyses can run on a remote server while the user controls and manages the process from their own computer via a simple web interface. In addition, Corra allows the user to output significantly differentially abundant LC-MS-detected peptide features in a form compatible with subsequent sequence identification via tandem mass spectrometry (MS/MS). We present two case studies to illustrate the application of Corra to commonly performed LC-MS-based biological workflows: a pilot biomarker discovery study of glycoproteins isolated from human plasma samples relevant to type 2 diabetes, and a study in yeast to identify in vivo targets of the protein kinase Ark1 via phosphopeptide profiling.

Conclusion

The Corra computational framework leverages computational innovation to let biologists and other researchers process, analyze, and visualize LC-MS data through what would otherwise be a complex and unfriendly suite of tools. Corra enables appropriate statistical analyses with controlled false-discovery rates, ultimately informing subsequent targeted identification of differentially abundant peptides by MS/MS. For the user not trained in bioinformatics, Corra represents a complete, customizable, free, and open-source computational platform for LC-MS-based proteomic workflows, and as such it addresses an unmet need in the LC-MS proteomics field.

6.
7.
The Advanced Photon Source at Argonne National Laboratory enables structural biologists to perform state-of-the-art crystallography diffraction experiments with high-intensity X-rays. The data gathered during such experiments are used to determine the molecular structure of macromolecules, enhancing, for example, the capabilities of modern drug design for basic and applied research. The steps involved in obtaining a complete structure are computationally intensive and require the proper adjustment of a considerable number of parameters that are not known a priori. It is therefore advantageous to develop a computational infrastructure that solves these numerically complex problems quickly, enabling quasi-real-time information discovery and computational steering. Specifically, we propose that the time-consuming calculations be performed in a "computational grid" accessing a large number of state-of-the-art computational facilities. Furthermore, we envision that experiments could be conducted by researchers at their home institutions via remote steering while a beamline technician performs the actual experiment; such an approach would be cost-efficient for the user. We conducted a case study involving the multiple tasks of a structural biologist, including data acquisition, data reduction, solution of the phase problem, and calculation of the final result: an electron density map, which is subsequently used for modeling the molecular structure. We developed a parallel program for the data reduction phase that reduces the turnaround time significantly, and we distributed the solution of the phase problem to obtain the resulting electron density map more quickly. We used the GUSTO testbed provided by the Globus metacomputing project as the source of the necessary state-of-the-art computational resources, including workstation clusters.
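Since each diffraction frame can be reduced independently, the data-reduction step parallelizes naturally. Below is a minimal sketch of that idea (not the authors' actual program, whose details the abstract does not give): per-frame work is distributed across worker processes and the results merged afterwards.

```python
"""Minimal sketch of frame-parallel data reduction. The per-frame function
is a stand-in for real work (spot finding, indexing, integration); the
frame count and workload are illustrative assumptions, not the authors' code."""
from multiprocessing import Pool

def reduce_frame(frame_id):
    # Stand-in for reducing one diffraction image to integrated intensities.
    checksum = sum(i * i for i in range(50_000))
    return frame_id, checksum

if __name__ == "__main__":
    frames = range(360)          # e.g. one frame per degree of crystal rotation
    with Pool() as pool:         # one worker process per core by default
        results = dict(pool.map(reduce_frame, frames))
    print(f"reduced {len(results)} frames")
```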

8.
The National Fusion Collaboratory project seeks to enable fusion scientists to exploit Grid capabilities in support of experimental science. To this end, we are exploring the concept of a collaborative control room that harnesses Grid and collaborative technologies to provide an environment in which remote experimental devices, codes, and expertise can interact in real time during an experiment. This concept has the potential to make fusion experiments more efficient by enabling researchers to perform more analysis and by engaging more expertise from a geographically distributed team of scientists and resources. As the realities of software development, talent distribution, and budgets increasingly encourage pooling resources and specialization, we see such environments as a necessary tool for future science. In this paper, we describe an experimental mock-up of a remote interaction with the DIII-D control room. The collaborative control room was demonstrated at SC03 and later reviewed at an international ITER Grid Workshop. We describe how the combined effect of various technologies (collaborative, visualization, and Grid) can be used effectively in experimental science. Specifically, we describe the Access Grid, experimental data presentation tools, and agreement-based resource management and workflow systems enabling time-bounded end-to-end application execution. We also report on FusionGrid services whose use during the fusion experimental cycle became possible for the first time thanks to this technology, and we discuss its potential use in future fusion experiments.

9.
The common scenario in computational biology in which a community of researchers conduct multiple statistical tests on one shared database gives rise to the multiple hypothesis testing problem. Conventional procedures for solving this problem control the probability of false discovery by sacrificing some of the power of the tests. We suggest a scheme for controlling false discovery without any power loss by adding new samples for each use of the database and charging the user with the expenses. The crux of the scheme is a carefully crafted pricing system that fairly prices different user requests based on their demands while keeping the probability of false discovery bounded. We demonstrate this idea in the context of HIV treatment research, where multiple researchers conduct tests on a repository of HIV samples.
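The abstract does not give the actual pricing formula, but the trade-off it addresses can be made concrete with a standard Bonferroni calculation: as the number of tests m on the shared database grows, keeping both the family-wise error rate and the per-test power fixed requires more samples, and the marginal sample cost of each additional test is what a fair pricing system would charge. A hedged sketch, using a textbook two-sample z-test power approximation:

```python
"""Illustrative sketch only: the paper's pricing system is not described in
the abstract. This shows the underlying arithmetic: under a Bonferroni bound
alpha/m, holding power fixed means buying more samples per test."""
from scipy.stats import norm

def samples_per_group(m, alpha=0.05, power=0.80, effect_size=0.5):
    # Two-sided two-sample z-test, equal group sizes, unit variance:
    # n = 2 * (z_{alpha/(2m)} + z_{power})^2 / d^2
    z_alpha = norm.ppf(1 - alpha / (2 * m))   # threshold tightens as m grows
    z_beta = norm.ppf(power)
    return 2 * ((z_alpha + z_beta) / effect_size) ** 2

previous = 0.0
for m in (1, 10, 100, 1000):
    n = samples_per_group(m)
    print(f"{m:4d} tests -> {n:7.1f} samples/group "
          f"(marginal cost: {n - previous:+7.1f})")
    previous = n
```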

10.
Several systems have been presented in recent years to manage the complexity of large microarray experiments. Although good results have been achieved, most systems fall short in one or more areas. A Grid-based approach can provide a shared, standardized, and reliable solution for the storage and analysis of biological data, maximizing the results of experimental efforts. A Grid framework has therefore been adopted, given the need to remotely access large amounts of distributed data and to scale computational performance for terabyte datasets. Two biological studies were planned to highlight the benefits that can emerge from our Grid-based platform. The described environment relies on storage and computational services provided by the gLite Grid middleware. The Grid environment also exploits the added value of metadata to let users better classify and search experiments. A state-of-the-art Grid portal has been implemented to hide the complexity of the framework from end users and to give them easy access to the available services and data; the functional architecture of the portal is described. As a first test of system performance, a gene expression analysis was performed on a dataset of Affymetrix GeneChip Rat Expression Array RAE230A chips from the ArrayExpress database. The analysis comprises three steps: (i) group opening and image-set uploading, (ii) normalization, and (iii) model-based gene expression (based on the PM/MM difference model). Two Linux versions (sequential and parallel) of the dChip software were developed to implement the analysis and were tested on a cluster. The results show that parallelizing the analysis process and executing parallel jobs on distributed computational resources actually improve performance. Moreover, the Grid environment has been tested both for the possibility of uploading and accessing distributed datasets through the Grid middleware and for its ability to manage the execution of jobs on distributed computational resources. Results from the Grid test will be discussed in a further paper.
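For readers unfamiliar with dChip's expression measure, the PM/MM difference model (Li and Wong's model-based expression index) fits the perfect-match/mismatch differences y_ij = PM_ij - MM_ij as the product of an array-level expression theta_i and a probe-level affinity phi_j. A toy alternating-least-squares sketch of that model follows; it is illustrative only, since dChip's estimator adds outlier handling and its own scaling conventions.

```python
"""Toy sketch of the PM/MM difference model: y[i, j] ~ theta[i] * phi[j].
Illustrative only; not the dChip implementation used in the study."""
import numpy as np

def mbei(y, iters=50):
    n_arrays, n_probes = y.shape
    phi = np.ones(n_probes)
    for _ in range(iters):
        theta = y @ phi / (phi @ phi)        # expression index per array
        phi = theta @ y / (theta @ theta)    # affinity per probe
        phi *= np.sqrt(n_probes) / np.linalg.norm(phi)  # fix the scale of phi
    return theta, phi

rng = np.random.default_rng(0)
true_theta = rng.uniform(1.0, 5.0, 6)        # 6 hypothetical arrays
true_phi = rng.uniform(0.5, 2.0, 11)         # 11 probe pairs in the probe set
y = np.outer(true_theta, true_phi) + rng.normal(0.0, 0.1, (6, 11))
theta, _ = mbei(y)
print(np.round(theta / true_theta, 2))       # near-constant ratio: recovered up to scale
```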

11.
Autism Spectrum Conditions (ASC) are characterized by heterogeneous impairments of social reciprocity and sensory processing. Voices, like faces, convey socially relevant information. Whether voice processing is selectively impaired remains undetermined. This study recorded mismatch negativity (MMN) while presenting emotionally spoken syllables ("dada") and acoustically matched nonvocal sounds to 20 subjects with ASC and 20 healthy matched controls. The people with ASC exhibited no MMN response to emotional syllables and reduced MMN to nonvocal sounds, indicating general impairments of affective voice and acoustic discrimination. Weaker angry-MMN amplitudes were associated with more autistic traits. Receiver operating characteristic analysis revealed that angry-MMN amplitudes yielded an area under the curve of 0.88 (p < .001). The results suggest that people with ASC may process emotional voices atypically already at the automatic stage. This processing abnormality could facilitate the diagnosis of ASC and enable social deficits in people with ASC to be predicted.
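For readers who want to reproduce this style of analysis, the ROC computation itself is straightforward: each participant's MMN amplitude serves as a classification score for ASC versus control, and discrimination is summarized by the area under the curve. A minimal sketch with hypothetical amplitudes (the study's actual data are not reproduced here):

```python
"""Hypothetical-data sketch of the ROC analysis; all numbers are invented."""
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
# MMN is a negativity (microvolts), so controls are assumed more negative;
# a weaker (less negative) angry MMN points toward ASC.
controls = rng.normal(-3.0, 1.0, 20)
asc = rng.normal(-1.0, 1.0, 20)

labels = np.r_[np.zeros(20), np.ones(20)]   # 1 = ASC
scores = np.r_[controls, asc]               # higher score -> more ASC-like
print(f"AUC = {roc_auc_score(labels, scores):.2f}")
```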

12.
The e-NMR project is a European cooperation initiative that aims to provide the bio-NMR user community with a software platform integrating and streamlining the computational approaches necessary for the analysis of bio-NMR data. The e-NMR platform is based on a Grid computational infrastructure, and a main focus of the current implementation is on streamlining structure determination protocols. To facilitate the use of NMR spectroscopy in the life sciences, the e-NMR consortium has set out to provide protocolized services through easy-to-use web interfaces, while still retaining sufficient flexibility to handle specific requests from expert users. Various programs relevant to structural biology applications are already available through the e-NMR portal, including HADDOCK, XPLOR-NIH, CYANA, and csRosetta. The implementation of these services, and in particular the distribution of calculations to the Grid infrastructure, has required the development of specific tools; the Grid infrastructure itself, however, is kept completely transparent to the users. With more than 150 registered users, e-NMR is currently the second largest European Virtual Organization in the life sciences.

13.
Energy efficiency and high computing power are basic design considerations across modern computing solutions, owing to concerns such as system performance, operational cost, and environmental impact. Desktop Grid and Volunteer Computing Systems (DGVCS), so-called opportunistic infrastructures, offer computational power at low cost by harvesting the idle computing cycles of existing commodity computing resources. Besides allowing the end-user offering to be customized, virtualization is considered one of the key techniques for reducing energy consumption in large-scale systems, and it contributes to system scalability. This paper presents an energy-efficient approach for opportunistic infrastructures based on task consolidation and the customization of virtual machines. Experimental results with single desktops and complete computer rooms show that virtualization significantly improves the energy efficiency of opportunistic grids compared with dedicated computing systems, without disturbing the end user.
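The abstract does not spell out the consolidation policy, but the core idea, packing tasks onto as few active machines as possible so the rest can idle or power down, can be sketched with a classic first-fit-decreasing heuristic (the capacities and loads below are illustrative assumptions, not the paper's policy):

```python
"""Illustrative first-fit-decreasing task consolidation; not the paper's policy."""

def consolidate(task_loads, host_capacity):
    """Pack task CPU shares onto hosts; returns one load list per active host."""
    hosts = []
    for load in sorted(task_loads, reverse=True):   # place biggest tasks first
        for host in hosts:
            if sum(host) + load <= host_capacity:   # first host with room
                host.append(load)
                break
        else:
            hosts.append([load])                    # must power on a new host
    return hosts

tasks = [0.6, 0.3, 0.5, 0.2, 0.4, 0.1, 0.7]         # hypothetical CPU shares
packed = consolidate(tasks, host_capacity=1.0)
for i, host in enumerate(packed):
    print(f"host {i}: {host} (load {sum(host):.1f})")
print(f"active hosts: {len(packed)} (for {len(tasks)} tasks)")
```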

14.
Modelling and simulation of complex cellular transactions involve the development of platforms that understand diverse mathematical representations and are capable of handling large backend computations. Grid Cellware, an integrated modelling and simulation tool, has been developed to address precisely these niche requirements of the modelling community. Grid Cellware implements various pathway simulation algorithms, along with an adaptive Swarm algorithm for parameter estimation. For enhanced computational productivity, Grid Cellware uses grid technology with Globus as the middleware.

15.
Systems biology is based on computational modelling and simulation of large networks of interacting components. Models may be intended to capture processes, mechanisms, components, and interactions at different levels of fidelity. Input data are often large and geographically dispersed, and may require the computation to be moved to the data, not vice versa. In addition, complex system-level problems require collaboration across institutions and disciplines. Grid computing can offer robust, scalable solutions for distributed data, compute, and expertise. We illustrate some of the range of computational and data requirements in systems biology with three case studies: one requiring large computation but small data (orthologue mapping in comparative genomics), a second involving complex terabyte data (the Visible Cell project), and a third that is both computationally and data-intensive (simulations at multiple temporal and spatial scales). Authentication, authorisation, and audit systems do not currently scale well and may present bottlenecks for distributed collaboration, particularly where outcomes may be commercialised. Challenges remain in providing lightweight standards to facilitate the penetration of robust, scalable grid-type computing into diverse user communities to meet the evolving demands of systems biology.

16.
Grid Computing consists of a collection of heterogeneous computers and resources spread across multiple administrative domains with the intent of providing users uniform access to these resources. There are many ways to access the resources of a Grid, each with unique security requirements and implications for both the resource user and the resource provider. A comprehensive set of Grid usage scenarios is presented and analyzed with regard to security requirements such as authentication, authorization, integrity, and confidentiality. The main value of these scenarios and the associated security discussions is to provide a library of situations against which an application designer can match, thereby facilitating security-aware application use and development from the initial stages of the application design and invocation. A broader goal of these scenarios is to increase the awareness of security issues in Grid Computing.

17.

Background  

Microarray data are often used for patient classification and gene selection. An appropriate tool for end users and biomedical researchers should combine user-friendliness with statistical rigor, carefully avoiding selection biases, allowing the analysis of multiple solutions, and providing access to additional functional information about selected genes. Methodologically, such a tool would be of greater use if it incorporated state-of-the-art computational approaches and made its source code available.

18.
A new approach to the job scheduling problem in computational grids
Job scheduling is one of the most challenging issues in Grid resource management and strongly affects the performance of the whole Grid environment. The major drawback of existing Grid scheduling algorithms is that they are unable to adapt to the dynamic nature of the resources and the network conditions. Furthermore, the network model used for resource-information aggregation in most scheduling methods is centralized or semi-centralized; these methods therefore do not scale well as the Grid grows and do not perform well as environmental conditions change over time. This paper proposes a learning automata-based job scheduling algorithm for Grids in which the workload placed on each Grid node is proportional to its computational capacity and varies with time according to the Grid constraints. The performance of the proposed algorithm is evaluated through several simulation experiments under different Grid scenarios, and the results are compared with those of several existing methods. The numerical results confirm the superiority of the proposed algorithm over the others in terms of makespan, flowtime, and load balancing.
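The abstract does not specify which automaton update rule the paper uses, but the general mechanism can be sketched with a standard linear reward-inaction (L_RI) automaton: the scheduler keeps a probability vector over Grid nodes and reinforces choices that complete jobs quickly, so the load distribution tracks changing node capacities. A hedged sketch with an invented reward signal:

```python
"""Illustrative L_RI learning automaton for node selection; the paper's
exact update rule, reward signal, and parameters are assumptions here."""
import random

class LRIScheduler:
    def __init__(self, n_nodes, learning_rate=0.05):
        self.p = [1.0 / n_nodes] * n_nodes      # action-probability vector
        self.a = learning_rate

    def pick_node(self):
        return random.choices(range(len(self.p)), weights=self.p)[0]

    def reward(self, node):
        # L_RI: shift probability mass toward a rewarded action;
        # on penalty, probabilities stay unchanged ("inaction").
        for j in range(len(self.p)):
            if j == node:
                self.p[j] += self.a * (1.0 - self.p[j])
            else:
                self.p[j] -= self.a * self.p[j]

random.seed(7)
speeds = [1.0, 2.5, 1.5]                        # hypothetical node capacities
sched = LRIScheduler(len(speeds))
for _ in range(3000):
    node = sched.pick_node()
    if random.random() < speeds[node] / max(speeds):  # fast completion = reward
        sched.reward(node)
print([round(p, 2) for p in sched.p])           # mass concentrates on node 1
```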

19.
Desktop grid systems and applications have had a significant impact on science and engineering. The emerging convergence of grid and peer-to-peer (P2P) computing technologies opens new opportunities for P2P desktop grid systems. This paper presents a taxonomy for classifying P2P desktop grid implementation paradigms, aiming to summarize the state-of-the-art technologies and to explore the current and potential solution space. To make the taxonomy comprehensive, we investigate both computational and data grid systems, and to ease understanding, we apply the taxonomy to selected case studies of P2P desktop grid systems. The taxonomy is expected to serve as a survey of the state of the art, a design map, a guideline for novice researchers, a common vocabulary, and a design space for simulation and benchmarking, and to be extended as the technologies rapidly evolve.

20.
We present a benchmark suite for computational Grids. It is based on the NAS Parallel Benchmarks (NPB) and is called the NAS Grid Benchmark (NGB) in this paper. We present NGB as a data-flow graph encapsulating an instance of an NPB code in each graph node, which communicates with other nodes by sending and receiving initialization data. These nodes may be mapped to the same or different Grid machines. Like NPB, NGB specifies several different classes (problem sizes). NGB also specifies the generic Grid services sufficient for running the suite; the implementor is free to choose any Grid environment. We describe a reference implementation in Java and present some scenarios for using NGB.
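Although NGB's reference implementation is in Java, the data-flow structure it describes is easy to picture with a few lines of driver code: nodes wrap benchmark instances, edges carry initialization data, and execution may follow any order consistent with the graph. A Python sketch (the graph shape, task names, and toy launcher are illustrative, not the NGB specification):

```python
"""Illustrative data-flow driver; graph shape and names are assumptions."""
from graphlib import TopologicalSorter

# Each node wraps one benchmark instance; edges say whose output it consumes.
graph = {
    "BT.0": set(),          # no predecessors: generates the initial data
    "MG.0": {"BT.0"},
    "FT.0": {"MG.0"},
    "BT.1": {"FT.0"},
}

def launch(task, input_files):
    # Stand-in for submitting the wrapped NPB code to some Grid machine.
    print(f"launch {task}, inputs: {input_files or 'none'}")
    return f"{task}.out"    # stands in for the transferred initialization data

outputs = {}
for task in TopologicalSorter(graph).static_order():
    outputs[task] = launch(task, [outputs[p] for p in graph[task]])
```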
