Similar Documents
20 similar documents found (search time: 31 ms)
1.
Systems biology is based on computational modelling and simulation of large networks of interacting components. Models may be intended to capture processes, mechanisms, components and interactions at different levels of fidelity. Input data are often large and geographically dispersed, and may require the computation to be moved to the data, not vice versa. In addition, complex system-level problems require collaboration across institutions and disciplines. Grid computing can offer robust, scalable solutions for distributed data, compute and expertise. We illustrate some of the range of computational and data requirements in systems biology with three case studies: one requiring large computation but small data (orthologue mapping in comparative genomics), a second involving complex terabyte data (the Visible Cell project) and a third that is both computationally and data-intensive (simulations at multiple temporal and spatial scales). Authentication, authorisation and audit systems currently do not scale well and may present bottlenecks for distributed collaboration, particularly where outcomes may be commercialised. Challenges remain in providing lightweight standards to facilitate the penetration of robust, scalable grid-type computing into diverse user communities to meet the evolving demands of systems biology.

2.
Several systems have been presented in recent years to manage the complexity of large microarray experiments. Although good results have been achieved, most systems fall short in one or more areas. A Grid-based approach can provide a shared, standardized and reliable solution for the storage and analysis of biological data, maximizing the results of experimental efforts. A Grid framework has therefore been adopted, given the need to remotely access large amounts of distributed data and to scale computational performance for terabyte datasets. Two different biological studies have been planned to highlight the benefits that can emerge from our Grid-based platform. The described environment relies on storage and computational services provided by the gLite Grid middleware. The Grid environment also exploits the added value of metadata to let users better classify and search experiments. A state-of-the-art Grid portal has been implemented to hide the complexity of the framework from end users and to give them easy access to the available services and data. The functional architecture of the portal is described. As a first test of system performance, a gene expression analysis was performed on a dataset of Affymetrix GeneChip Rat Expression Array RAE230A from the ArrayExpress database. The analysis consists of three steps: (i) group opening and image set uploading, (ii) normalization, and (iii) model-based gene expression (based on the PM/MM difference model). Two different Linux versions (sequential and parallel) of the dChip software have been developed to implement the analysis and have been tested on a cluster. The results show that parallelizing the analysis process and executing parallel jobs on distributed computational resources actually improves performance. Moreover, the Grid environment has been tested both for its ability to upload and access distributed datasets through the Grid middleware and for its ability to manage the execution of jobs on distributed computational resources. Results from the Grid test will be discussed in a further paper.

3.
Parametric uncertainty is a particularly challenging and relevant aspect of systems analysis in domains such as systems biology where, both for inference and for assessing prediction uncertainties, it is essential to characterize the system behavior globally in the parameter space. However, current methods based on local approximations or on Monte-Carlo sampling cope only insufficiently with the high-dimensional parameter spaces associated with complex network models. Here, we propose an alternative deterministic methodology that relies on sparse polynomial approximations: a deterministic computational interpolation scheme which adaptively identifies the most significant expansion coefficients. The scheme is based on adaptive Smolyak interpolation of the parametric solution at judiciously and adaptively chosen points in parameter space. Like Monte-Carlo sampling, it is "non-intrusive" and well-suited for massively parallel implementation, but it affords higher convergence rates. We demonstrate its performance on kinetic model equations from computational systems biology with several hundred parameters and state variables, obtaining numerical approximations of the parametric solution on the entire parameter space. This opens up new avenues for large-scale dynamic network analysis by enabling scaling for many applications, including parameter estimation, uncertainty quantification, and systems design.
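The adaptive Smolyak construction itself is involved, but its defining property — building a polynomial surrogate from individual, independent model solves ("non-intrusive", like Monte-Carlo sampling) — can be shown in one dimension. The sketch below is a rough illustration, not the paper's method: it fits a Chebyshev surrogate to the terminal state of a toy kinetic ODE over one rate parameter. The ODE, the parameter range and the number of collocation points are all illustrative assumptions.

```python
# Minimal sketch of a non-intrusive parametric surrogate (a 1-D analogue of
# the idea; the paper's method uses adaptive Smolyak interpolation in high
# dimensions). We solve a toy kinetic ODE at Chebyshev-distributed parameter
# values and interpolate the solution over the whole parameter interval.
import numpy as np
from numpy.polynomial import chebyshev as C
from scipy.integrate import solve_ivp

def terminal_state(k, t_end=5.0, x0=1.0):
    """Solve dx/dt = -k*x + k/(1+x) and return x(t_end) for parameter k."""
    sol = solve_ivp(lambda t, x: [-k * x[0] + k / (1.0 + x[0])],
                    (0.0, t_end), [x0], rtol=1e-8, atol=1e-10)
    return sol.y[0, -1]

k_lo, k_hi = 0.1, 2.0
n = 17                                   # number of collocation points
# Chebyshev points mapped to [k_lo, k_hi]; each point costs one ODE solve,
# exactly like one Monte-Carlo sample ("non-intrusive").
z = np.cos(np.pi * (2 * np.arange(n) + 1) / (2 * n))
ks = 0.5 * (k_hi + k_lo) + 0.5 * (k_hi - k_lo) * z
vals = np.array([terminal_state(k) for k in ks])

coef = C.chebfit(z, vals, deg=n - 1)     # surrogate coefficients

def surrogate(k):
    zk = (2 * k - (k_hi + k_lo)) / (k_hi - k_lo)
    return C.chebval(zk, coef)

# The surrogate now evaluates cheaply anywhere in parameter space.
k_test = 1.3
print(surrogate(k_test), terminal_state(k_test))
```

Each collocation point costs one model solve, but the fitted surrogate then evaluates essentially for free anywhere in the parameter interval; the Smolyak scheme extends this idea to hundreds of dimensions with a sparse, adaptively refined point set.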

4.
Space is a very important aspect of the simulation of biochemical systems; recently, the need for simulation algorithms able to cope with space has become more and more compelling. Complex and detailed models of biochemical systems need to deal with the movement of single molecules and particles, taking into consideration localized fluctuations, transportation phenomena, and diffusion. A common drawback of spatial models lies in their complexity: models can become very large, and their simulation can be time consuming, especially if we want to capture the system's behavior reliably using stochastic methods in conjunction with high spatial resolution. To deliver on the promise made by systems biology to understand a system as a whole, we need to scale up the size of the models we are able to simulate, moving from sequential to parallel simulation algorithms. In this paper, we analyze Smoldyn, a widely used algorithm for stochastic simulation of chemical reactions with spatial resolution and single-molecule detail, and we propose an alternative, innovative implementation that exploits the parallelism of Graphics Processing Units (GPUs). The implementation executes the most computationally demanding steps (computation of diffusion, unimolecular and bimolecular reactions, as well as the most common cases of molecule-surface interaction) on the GPU, computing them in parallel for each molecule of the system. The implementation offers good speed-ups and real-time, high-quality graphics output.
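To make the per-molecule parallelism concrete, here is a minimal NumPy sketch of the kind of per-step work such a simulator performs (diffusion, unimolecular decay, and a reflective boundary); a GPU implementation would assign one thread per molecule for the same update. Rates, box size and time step are illustrative assumptions, and this is not Smoldyn's actual code or API.

```python
# Per-step work of a Smoldyn-style particle simulator, vectorized over all
# molecules (the GPU version computes the same update with one thread each).
import numpy as np

rng = np.random.default_rng(0)
D, dt, k_decay = 1.0, 1e-3, 0.5                 # diffusion coeff., step, decay rate
pos = rng.uniform(0.0, 10.0, size=(10000, 3))   # molecule positions in a box

def step(pos):
    # Diffusion: independent Gaussian displacement per molecule (Brownian).
    pos = pos + rng.normal(0.0, np.sqrt(2.0 * D * dt), size=pos.shape)
    # Unimolecular reaction: each molecule decays with prob. 1 - exp(-k*dt).
    survive = rng.random(len(pos)) >= 1.0 - np.exp(-k_decay * dt)
    # Reflective boundaries on the box [0, 10]^3 (simple surface interaction).
    pos = np.where(pos < 0.0, -pos, pos)
    pos = np.where(pos > 10.0, 20.0 - pos, pos)
    return pos[survive]

for _ in range(100):
    pos = step(pos)
print(len(pos), "molecules remain")
```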

5.

Background  

Matlab, a powerful and productive language that allows for rapid prototyping, modeling and simulation, is widely used in computational biology. Modeling and simulation of large biological systems often require more computational resources than are available on a single computer. Existing distributed computing environments like the Distributed Computing Toolbox, MatlabMPI, Matlab*G and others allow for the remote (and possibly parallel) execution of Matlab commands, with varying support for features like an easy-to-use application programming interface, load-balanced utilization of resources, extensibility over the wide area network, and minimal system administration skill requirements. However, all of these environments require some level of access to participating machines to manually distribute the user-defined libraries that the remote call may invoke.

6.
Zhang X, Luo B, Fang X, Pan L. Bio Systems 2012, 108(1-3): 52-62
Spiking neural P systems (SN P systems, for short) are a class of distributed parallel computing devices inspired by the way neurons communicate by means of spikes: neurons work in parallel in the sense that each neuron that can fire should fire, but the work within each neuron is sequential in the sense that at most one rule can be applied at each computation step. In this work, we consider SN P systems with the restriction that at most one neuron can fire at each step, and each neuron works in an exhaustive manner (a kind of local parallelism: an applicable rule in a neuron is used as many times as possible). Such SN P systems are called sequential SN P systems with exhaustive use of rules, and we investigate their computational power. Specifically, characterizations of Turing computability and of semilinear sets of numbers are obtained, and a strict superclass of the semilinear sets is generated. The results show that the computational power of sequential SN P systems with exhaustive use of rules is closely related to the types of spiking rules in the neurons.
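A toy sketch of the regime studied here may help: at each step exactly one neuron fires (sequentiality), and the applied rule is used as many times as the neuron's spike count allows (exhaustive use). The rule encoding and the choose-the-fullest-neuron policy below are illustrative assumptions, not constructions from the paper.

```python
# Toy simulator for the "sequential with exhaustive use of rules" regime:
# one neuron fires per step, applying its rule as many times as possible.
from dataclasses import dataclass, field

@dataclass
class Neuron:
    spikes: int
    consume: int          # spikes consumed per rule application
    produce: int          # spikes emitted per rule application
    targets: list = field(default_factory=list)   # synapse target indices

def step(neurons):
    # Sequentiality: pick one fireable neuron (here: the one with most spikes).
    fireable = [i for i, n in enumerate(neurons) if n.spikes >= n.consume]
    if not fireable:
        return False
    n = neurons[max(fireable, key=lambda j: neurons[j].spikes)]
    times = n.spikes // n.consume      # exhaustive: maximal number of applications
    n.spikes -= times * n.consume
    for t in n.targets:                # each target receives all emitted spikes
        neurons[t].spikes += times * n.produce
    return True

neurons = [Neuron(7, 2, 1, [1]), Neuron(0, 3, 2, [0])]
while step(neurons):
    print([n.spikes for n in neurons])
```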

7.
Cactus Tools for Grid Applications
Cactus is an open source problem-solving environment designed for scientists and engineers. Its modular structure facilitates parallel computation across different architectures and collaborative code development between different groups. The Cactus Code originated in the academic research community, where it has been developed and used over many years by a large international collaboration of physicists and computational scientists. We discuss here how the intensive computing requirements of the physics applications now using the Cactus Code encourage the use of distributed computing and metacomputing, and detail how its design makes it an ideal application test-bed for Grid computing. We describe the development of tools and the experiments which have already been performed in a Grid environment with Cactus, including distributed simulations, remote monitoring and steering, and data handling and visualization. Finally, we discuss how Grid portals, such as those already developed for Cactus, will open the door to global computing resources for scientific users.

8.
Cluster Computing - The cloud has evolved into an attractive execution environment for parallel applications, which make use of compute resources to speed up the computation of large problems in...

9.
Buiu C, Arsene O, Cipu C, Patrascu M. Bio Systems 2011, 103(3): 442-447
A P system is a distributed and parallel bio-inspired computing model in which the basic data structures are multisets or strings. Numerical P systems have recently been introduced; they use numerical variables and local programs (or evolution rules), usually in a deterministic way, and may find interesting applications in areas such as computational biology, process control or robotics. The first simulator of numerical P systems (SNUPS) has been designed, implemented and made available to the scientific community by the authors of this paper. SNUPS supports a wide range of applications, from the modeling and simulation of ordinary differential equations, to the use of membrane systems as computational blocks of cognitive architectures and as controllers for autonomous mobile robots. This paper describes the functioning of a numerical P system and presents an overview of SNUPS capabilities together with an illustrative example. Availability: SNUPS is freely available to researchers as a standalone application and may be downloaded from a dedicated website, http://snups.ics.pub.ro/, which includes a user manual and sample membrane structures.
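As a rough illustration of what a numerical P system simulator computes, the sketch below performs one evolution step: each program evaluates its production function on the current variable values, the contributing variables are reset, and the produced quantity is distributed to target variables in proportion to the repartition coefficients. The encoding is an assumption for illustration; it is not SNUPS's data model.

```python
# One evolution step of a (deterministic) numerical P system.
def np_step(values, programs):
    produced = {v: 0.0 for v in values}
    consumed = set()
    for func, inputs, repartition in programs:
        total = func(*[values[v] for v in inputs])   # production function
        consumed.update(inputs)                      # inputs are consumed
        weight = sum(c for c, _ in repartition)
        for c, target in repartition:                # repartition protocol
            produced[target] += total * c / weight
    for v in values:
        values[v] = (0.0 if v in consumed else values[v]) + produced[v]
    return values

values = {"x11": 2.0, "x21": 3.0}
programs = [
    # 2*x11 + x21 -> 1|x11 + 1|x21  (split the production equally)
    (lambda a, b: 2 * a + b, ["x11", "x21"], [(1, "x11"), (1, "x21")]),
]
print(np_step(values, programs))   # {'x11': 3.5, 'x21': 3.5}
```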

10.
Researchers are perpetually amassing biological sequence data. The computational approaches employed by ecologists for organizing these data (e.g. alignment, phylogeny) typically scale nonlinearly in execution time with the size of the dataset. This often becomes a bottleneck for processing experimental data, since many molecular studies are characterized by massive datasets. To keep up with experimental data demands, ecologists are forced to choose between continually upgrading expensive in-house computer hardware or outsourcing the most demanding computations to the cloud. Outsourcing is attractive since it is the least expensive option, but it does not necessarily allow direct user interaction with the data for exploratory analysis. Desktop analytical tools such as ARB are indispensable for this purpose, but they do not offer a convenient solution for coordinating and integrating datasets between local and outsourced destinations. Researchers are therefore currently left with an undesirable tradeoff between computational throughput and analytical capability. To mitigate this tradeoff we introduce a software package that combines the interactive exploratory tools offered by ARB with the computational throughput of cloud-based resources. Our pipeline serves as middleware between the desktop and the cloud, allowing researchers to build local custom databases containing sequences and metadata from multiple resources and to link data outsourced for computation back to the local database. A tutorial implementation of the toolkit is provided in the supporting information, S1 Tutorial. Availability: http://www.ece.drexel.edu/gailr/EESI/tutorial.php.

11.
The control of compliant robots is, due to their often nonlinear and complex dynamics, inherently difficult. The vision of morphological computation proposes to view these aspects not only as problems, but also as parts of the solution. Non-rigid body parts are no longer seen as imperfect realizations of rigid body parts, but as potential computational resources. The applicability of this vision has already been demonstrated for a variety of complex robot control problems. Nevertheless, a theoretical basis for understanding the capabilities and limitations of morphological computation has so far been missing. We present a model for morphological computation with compliant bodies in which a precise mathematical characterization of the potential computational contribution of a complex physical body is feasible. The theory suggests that complexity and nonlinearity, typically unwanted properties of robots, are desirable features because they provide computational power. We demonstrate that simple generic models of physical bodies, based on mass-spring systems, can be used to implement complex nonlinear operators. By adding a simple readout (which is static and linear) to the morphology, such devices are able to emulate complex mappings of input to output streams in continuous time. Hence, by outsourcing parts of the computation to the physical body, the difficult problem of learning to control a complex body can be reduced to a simple and perspicuous learning task, which cannot get stuck in local minima of an error function.
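The following sketch illustrates the scheme under stated assumptions (it is not the paper's model): a driven chain of nonlinear damped mass-spring nodes acts as the fixed "body", and only a static linear readout is trained, by ridge regression, to emulate a nonlinear, history-dependent target mapping of the input stream.

```python
# Morphological computation sketch: fixed nonlinear mass-spring "body" plus a
# trained static linear readout. All parameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
N, T, dt = 20, 5000, 0.01
u = rng.uniform(-1, 1, T)                                   # input stream
# Nonlinear, history-dependent target mapping of the input.
target = np.tanh(3 * np.convolve(u, np.ones(10) / 10, mode="same"))

# Body state: positions x and velocities v of N nodes with nonlinear springs.
k = rng.uniform(1.0, 5.0, N); c = 0.5; w_in = rng.normal(0, 1, N)
x = np.zeros(N); v = np.zeros(N); states = np.empty((T, N))
for t in range(T):
    force = -k * x - 2.0 * x**3 - c * v + w_in * u[t]       # nonlinear dynamics
    v += dt * force
    x += dt * v
    states[t] = x                                           # the body does the work

# Static linear readout trained by ridge regression on the body's state.
lam = 1e-4
W = np.linalg.solve(states.T @ states + lam * np.eye(N), states.T @ target)
pred = states @ W
print("readout NMSE:", np.mean((pred - target) ** 2) / np.var(target))
```

The only learned object is the linear map `W`, obtained in closed form, which is why the control-learning problem cannot get stuck in local minima.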

12.
Engineers have a lot to gain from studying biology. The study of biological neural systems alone provides numerous examples of computational systems that are far more complex than any man-made system and perform real-time sensory and motor tasks in a manner that humbles the most advanced artificial systems. Despite the evolutionary genesis of these systems and the vast apparent differences between species, there are common design strategies employed by biological systems that span taxa, and engineers would do well to emulate these strategies. However, biologically-inspired computational architectures, which are continuous-time and parallel in nature, do not map well onto conventional processors, which are discrete-time and serial in operation. Rather, an implementation technology that is capable of directly realizing the layered parallel structure and nonlinear elements employed by neurobiology is required for power- and space-efficient implementation. Custom neuromorphic hardware meets these criteria and yields low-power dedicated sensory systems that are small, light, and ideal for autonomous robot applications. As examples of how this technology is applied, this article describes both a low-level neuromorphic hardware emulation of an elementary visual motion detector, and a large-scale, system-level spatial motion integration system.

13.

Background

Clinical decision support systems can effectively extend the limits of doctors' knowledge and reduce the possibility of misdiagnosis, thereby enhancing health care. Traditional genetic data storage and analysis methods based on stand-alone environments cannot meet the computational requirements of rapidly growing genetic data because of their limited scalability.

Methods

In this paper, we propose a distributed gene clinical decision support system named GCDSS and implement a prototype based on cloud computing technology. We also present CloudBWA, a novel distributed read-mapping algorithm that leverages a batch-processing strategy to map reads on Apache Spark.

Results

Experiments show that the distributed gene clinical decision support system GCDSS and the distributed read-mapping algorithm CloudBWA deliver outstanding performance and excellent scalability. Compared with state-of-the-art distributed algorithms, CloudBWA achieves up to a 2.63× speedup over SparkBWA. Compared with stand-alone algorithms, CloudBWA with 16 cores achieves up to an 11.59× speedup over BWA-MEM with 1 core.

Conclusions

GCDSS is a distributed gene clinical decision support system based on cloud computing techniques. In particular, we incorporate a distributed genetic data analysis pipeline framework into the proposed system. To boost data processing in GCDSS, we propose CloudBWA, a novel distributed read-mapping algorithm that leverages batch processing in the mapping stage on the Apache Spark platform.
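As a rough illustration of the batch-processing strategy on Spark (not CloudBWA's actual interface), the sketch below partitions reads and aligns each partition as one batch via `mapPartitions`; `align_batch` is a hypothetical stand-in for invoking an aligner such as BWA-MEM once per batch.

```python
# Sketch of batch read-mapping on Apache Spark. `align_batch` is a
# hypothetical placeholder, not CloudBWA's real alignment routine.
from pyspark import SparkContext

def align_batch(reads):
    # In a real pipeline this would hand the whole batch to an aligner once,
    # amortising index-loading cost across the entire partition.
    for read_id, seq in reads:
        yield (read_id, "aligned:" + seq[:10])

sc = SparkContext(appName="batch-read-mapping-sketch")
reads = sc.parallelize([("r%d" % i, "ACGT" * 25) for i in range(100000)],
                       numSlices=16)            # 16 batches / partitions
alignments = reads.mapPartitions(align_batch)   # one aligner call per batch
print(alignments.take(2))
sc.stop()
```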

14.
Program development environments have enabled graphics processing units (GPUs) to become an attractive high-performance computing platform for the scientific community. A commonly posed problem in computational biology is searching protein databases for functional similarities. The most accurate algorithm for sequence alignment is Smith-Waterman (SW); however, due to its computational complexity and rapidly increasing database sizes, the process becomes more and more time consuming, making cluster-based systems more desirable. Scalable and highly parallel methods are therefore necessary to make SW a viable solution for life science researchers. In this paper we evaluate how SW fits the target GPU architecture by exploring ways to map the program architecture onto the processor architecture. We develop new techniques to reduce the memory footprint of the application while exploiting the memory hierarchy of the GPU. With this implementation, GSW, we overcome the on-chip memory size constraint, achieving a 23× speedup compared to a serial implementation. Results show that as the query length increases our speedup stays almost stable, indicating the solid scalability of our approach. Additionally, this is a first-of-its-kind implementation that runs purely on the GPU rather than in an integrated CPU-GPU environment, making our design suitable for porting onto a cluster of GPUs.
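For reference, the serial dynamic programme that GSW parallelizes is shown below; the scoring parameters are illustrative. In a GPU setting, all cells along one anti-diagonal of the matrix are independent and can be computed in parallel.

```python
# Reference (serial) Smith-Waterman local alignment, linear gap penalty.
import numpy as np

def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    H = np.zeros((len(a) + 1, len(b) + 1), dtype=int)
    best = 0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            diag = H[i - 1, j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: scores are clamped at zero.
            H[i, j] = max(0, diag, H[i - 1, j] + gap, H[i, j - 1] + gap)
            best = max(best, H[i, j])
    return best   # maximal local-alignment score

print(smith_waterman("ACACACTA", "AGCACACA"))
```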

15.
We have evaluated reconstruction methods using smooth basis functions in the electron tomography of complex biological specimens. In particular, we have investigated series expansion methods, with special emphasis on parallel computation. Among the methods investigated, the component averaging techniques have proven to be most efficient and have generally shown fast convergence rates. The use of smooth basis functions provides the reconstruction algorithms with an implicit regularization mechanism, very appropriate for noisy conditions. Furthermore, we have applied high-performance computing (HPC) techniques to address the computational requirements demanded by the reconstruction of large volumes. One of the standard techniques in parallel computing, domain decomposition, has yielded an effective computational algorithm which hides the latencies due to interprocessor communication. We present comparisons with weighted back-projection (WBP), one of the standard reconstruction methods, in terms of computational demand and reconstruction quality under noisy conditions. According to objective measures of quality, the iterative techniques yield better results than WBP after very few iterations. As a consequence, the combination of efficient iterative algorithms and HPC techniques has proven to be well suited to the reconstruction of large biological specimens in electron tomography, yielding solutions in reasonable computation times.
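As a generic illustration of the iterative series-expansion family compared with WBP (not the paper's component-averaging variant, which additionally weights updates by per-component sparsity), here is a simple SIRT/Landweber-style iteration for the linear reconstruction model Ax = b on a toy system:

```python
# Simple SIRT-style iterative reconstruction for A x = b (toy dimensions).
import numpy as np

def sirt(A, b, n_iter=50, relax=1.0):
    x = np.zeros(A.shape[1])
    col = A.sum(axis=0); col[col == 0] = 1.0     # column normalisation
    row = A.sum(axis=1); row[row == 0] = 1.0     # row normalisation
    for _ in range(n_iter):
        # Back-project the normalised residual onto the volume estimate.
        x = x + relax * (A.T @ ((b - A @ x) / row)) / col
    return x

rng = np.random.default_rng(2)
A = rng.random((200, 100))                        # toy projection operator
x_true = rng.random(100)
b = A @ x_true + 0.01 * rng.normal(size=200)      # noisy "projections"
x = sirt(A, b)
print("relative error:", np.linalg.norm(x - x_true) / np.linalg.norm(x_true))
```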

16.
Goal, Scope, and Background

As Life Cycle Assessment (LCA) and Input-Output Analysis (IOA) systems increase in size, computation times and memory usage can increase rapidly. Efficient methods of solution enable a wide range of analysis techniques; some, such as Monte-Carlo analysis, may be impractical if computation is too slow.

Discussion of Methods

In this article, I describe algorithms that substantially reduce computation times and memory usage for solving LCA and IOA systems and performing Monte-Carlo analysis. The algorithms are based on well-established iterative methods for solving linear systems and exploit the power series expansion of the Leontief inverse. They are further enhanced by using sparse matrix algebra.

Results and Discussion

The algorithms presented in this article reduce computation time and memory usage by orders of magnitude while retaining a high degree of accuracy. For a 3225×3225 LCA system, the algorithm reduced computation time from 70 s to 0.06 s while retaining an accuracy of 10⁻³%; storage was reduced from 166 megabytes to 1.8 megabytes. The algorithm was used to perform a Monte-Carlo analysis on the same system with 1,000 samples in 90 s. I also discuss issues of power series convergence for general LCA and IOA systems and show that convergence will generally hold due to the mathematical structure of these systems.

Conclusions

By exploiting the mathematical structure of LCA and IOA systems, iterative techniques substantially reduced the computational times required for solving LCA and IOA systems and for performing Monte-Carlo simulations. This allows more widespread implementation of analysis techniques, such as Monte-Carlo analysis, in LCA and IOA.

Recommendations and Perspectives

It is suggested that algorithms such as the ones described in this article be implemented in LCA packages. Various checks can be used to keep computational errors to a minimum.
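A minimal sketch of the core idea, under illustrative data: accumulate the Leontief power series x = d + Ad + A²d + … with sparse matrix-vector products instead of forming (I − A)⁻¹ explicitly, stopping when the latest term is negligible.

```python
# Power-series solution of an LCA/IOA system (I - A) x = d using sparse
# matrix-vector products. The matrix and demand vector are toy stand-ins.
import numpy as np
import scipy.sparse as sp

def leontief_series(A, d, tol=1e-8, max_terms=1000):
    x = d.copy()
    term = d.copy()
    for _ in range(max_terms):
        term = A @ term                      # next term of the series: A^k d
        x += term
        if np.linalg.norm(term) < tol * np.linalg.norm(x):
            return x
    raise RuntimeError("series did not converge (spectral radius of A >= 1?)")

n = 3000
A = sp.random(n, n, density=0.001, format="csr", random_state=0)
A = A * (0.9 / abs(A).sum(axis=1).max())     # ensure spectral radius < 1
d = np.ones(n)
x = leontief_series(A, d)
print(np.allclose(x - A @ x, d, atol=1e-6))  # x solves (I - A) x = d
```

Because each Monte-Carlo sample only perturbs A and d, re-running this cheap series per sample is what makes thousand-sample uncertainty analyses affordable.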

17.

Background  

We consider the problem of parameter estimation (model calibration) in nonlinear dynamic models of biological systems. Due to the frequent ill-conditioning and multi-modality of many of these problems, traditional local methods usually fail (unless initialized with very good guesses of the parameter vector). In order to surmount these difficulties, global optimization (GO) methods have been suggested as robust alternatives. Currently, deterministic GO methods cannot solve problems of realistic size within this class in reasonable computation times. In contrast, certain types of stochastic GO methods have shown promising results, although the computational cost remains large. Rodriguez-Fernandez and coworkers have presented hybrid stochastic-deterministic GO methods which could reduce computation time by one order of magnitude while guaranteeing robustness. Our goal here was to further reduce the computational effort without losing robustness.
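To make the hybrid strategy concrete, here is a generic sketch (not the specific method developed in the paper): a cheap stochastic global phase proposes candidate parameter vectors, and a deterministic local solver refines the best few. The objective, bounds and sample sizes are illustrative assumptions.

```python
# Generic hybrid stochastic-deterministic global optimization sketch.
import numpy as np
from scipy.optimize import minimize

def cost(theta):
    # Stand-in for a model-calibration objective (multi-modal least squares).
    return (np.sum((theta - np.array([0.3, 1.7, -0.5])) ** 2 * [1, 10, 100])
            + 0.5 * np.sin(5 * theta).sum() ** 2)

rng = np.random.default_rng(3)
bounds = [(-2.0, 2.0)] * 3
# Global phase: stochastic sampling (a real method would use scatter search,
# evolutionary strategies, etc.).
cands = rng.uniform(-2, 2, size=(200, 3))
best = sorted(cands, key=cost)[:5]
# Local phase: deterministic gradient-based refinement of the best candidates.
results = [minimize(cost, x0, method="L-BFGS-B", bounds=bounds) for x0 in best]
winner = min(results, key=lambda r: r.fun)
print(winner.x, winner.fun)
```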

18.
Reverse computation is presented here as an important future direction in addressing the challenge of fault-tolerant execution on very large cluster platforms for parallel computing. As the scale of parallel jobs increases, traditional checkpointing approaches suffer scalability problems ranging from computational slowdowns to high congestion at the persistent stores for checkpoints. Reverse computation can overcome such problems and is also better suited for parallel computing on newer architectures with smaller, cheaper or energy-efficient memories and file systems. Initial evidence for the feasibility of reverse computation in large systems is presented with detailed performance data from a particle (ideal gas) simulation scaling to 65,536 processor cores and 950 accelerators (GPUs). Reverse computation is observed to deliver very large gains relative to checkpointing schemes when nodes rely on their host processors/memory to tolerate faults at their accelerators. A comparison between reverse computation and checkpointing using measurements such as cache miss ratios, TLB misses and memory usage indicates that reverse computation is hard to ignore as a future alternative for emerging architectures.
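The core idea can be shown in a few lines: write each simulation step as a pair of exactly inverse updates, so that rolling back after a fault re-runs inverse steps instead of reloading saved state. The integer particle update below is an illustrative assumption, not the paper's gas model.

```python
# Reverse computation sketch: forward/reverse step pairs replace checkpoints.
def forward(state):
    # Reversible update: every operation below has an exact inverse.
    x, v = state
    v = v + 3          # constant "force" kick
    x = x + v          # drift
    return (x, v)

def reverse(state):
    x, v = state
    x = x - v          # undo drift (inverse operations in reverse order)
    v = v - 3          # undo kick
    return (x, v)

initial = state = (0, 1)
for _ in range(1000):
    state = forward(state)    # run forward; no checkpoints are stored
for _ in range(1000):
    state = reverse(state)    # roll back purely by inverse computation
assert state == initial       # initial state recovered exactly
print(state)
```

Integer (or otherwise exactly invertible) arithmetic matters here: the reverse path must reproduce state bit-for-bit, which is what lets rollback replace persistent checkpoints.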

19.
Previously, DAG scheduling schemes used the mean (average) computation or communication time when dealing with temporal heterogeneity. However, it is not optimal to consider only the means of computation and communication times when scheduling DAGs on a temporally (and spatially) heterogeneous distributed computing system. In this paper, it is proposed that the second-order moments of computation and communication times, such as their standard deviations, be taken into account in addition to their means when scheduling "stochastic" DAGs. An effective scheduling approach has been developed which accurately estimates the earliest start time of each node and derives a schedule leading to a shorter average parallel execution time. Extensive computer simulation has shown that the proposed approach achieves a significant reduction in the average parallel execution times of stochastic DAGs.
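A toy sketch of the underlying idea (the paper's estimator is more careful): pad each predecessor's finish-time estimate with a multiple of its standard deviation before taking the maximum, so that high-variance predecessors correctly dominate the earliest start time. The DAG, the moments and the risk factor kappa below are illustrative assumptions.

```python
# Earliest-start-time estimation for a stochastic DAG using mean + k*sigma.
def est_with_moments(dag, mu, sigma, kappa=1.0):
    """dag: node -> list of predecessors; returns a pessimistic EST per node."""
    est = {}
    for node in topo_order(dag):
        finish = [est[p] + mu[p] + kappa * sigma[p] for p in dag[node]]
        est[node] = max(finish, default=0.0)
    return est

def topo_order(dag):
    seen, order = set(), []
    def visit(n):
        if n not in seen:
            seen.add(n)
            for p in dag[n]:      # predecessors first
                visit(p)
            order.append(n)
    for n in dag:
        visit(n)
    return order

dag = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
mu = {"a": 2.0, "b": 4.0, "c": 4.0, "d": 1.0}
sigma = {"a": 0.1, "b": 2.0, "c": 0.2, "d": 0.1}
print(est_with_moments(dag, mu, sigma))
# With kappa=0 (means only), b and c look identical; the variance term
# reveals that b's high uncertainty dominates d's start time.
```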

20.
Information transfer, measured by transfer entropy, is a key component of distributed computation, so it is important to understand the pattern of information transfer in order to unravel the distributed computational algorithms of a system. Since distributed computation in many natural systems is thought to rely on rhythmic processes, a frequency-resolved measure of information transfer is highly desirable. Here, we present a novel algorithm, and its efficient implementation, to identify separately the frequencies sending and receiving information in a network. Our approach relies on the invertible maximum overlap discrete wavelet transform (MODWT) for the creation of surrogate data in the computation of transfer entropy and entirely avoids filtering of the original signals, thereby sidestepping well-known problems due to phase shifts and the ineffectiveness of filtering in the information-theoretic setting. We also show that measuring frequency-resolved information transfer is a partial information decomposition problem that cannot be fully resolved to date, and we discuss the implications of this issue. Finally, we evaluate the performance of our algorithm on simulated data and apply it to human magnetoencephalography (MEG) recordings and to local field potential recordings in the ferret. In human MEG we demonstrate top-down information flow in temporal cortex from very high frequencies (above 100 Hz) to both similarly high frequencies and to frequencies around 20 Hz, i.e. a complex spectral configuration of cortical information transmission that has not been described before. In the ferret we show that the prefrontal cortex sends information at low frequencies (4-8 Hz) to early visual cortex (V1), while V1 receives the information at high frequencies (> 125 Hz).
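For orientation, here is a minimal plug-in estimator of plain (time-domain, not frequency-resolved) transfer entropy for binary series, TE(X→Y) = I(Y_t ; X_{t−1} | Y_{t−1}); the MODWT surrogate machinery that makes the measure frequency-resolved is beyond this sketch, and the estimator itself is an illustrative assumption, not the paper's implementation.

```python
# Plug-in transfer entropy TE(X -> Y) for binary series with history length 1.
import numpy as np
from collections import Counter

def transfer_entropy(x, y):
    triples = Counter(zip(y[1:], y[:-1], x[:-1]))   # (y_t, y_{t-1}, x_{t-1})
    n = sum(triples.values())
    te = 0.0
    for (yt, yp, xp), c in triples.items():
        p_xyz = c / n                                           # p(y_t, y_{t-1}, x_{t-1})
        p_hist = sum(v for (a, b, d), v in triples.items()
                     if b == yp and d == xp) / n                # p(y_{t-1}, x_{t-1})
        p_yy = sum(v for (a, b, d), v in triples.items()
                   if a == yt and b == yp) / n                  # p(y_t, y_{t-1})
        p_y = sum(v for (a, b, d), v in triples.items()
                  if b == yp) / n                               # p(y_{t-1})
        te += p_xyz * np.log2(p_xyz * p_y / (p_hist * p_yy))
    return te

rng = np.random.default_rng(4)
x = rng.integers(0, 2, 10000)
y = np.roll(x, 1)                 # y copies x with lag 1 -> strong X->Y transfer
y[0] = 0
print(transfer_entropy(x, y))     # ~1 bit
print(transfer_entropy(y, x))     # ~0 bits
```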
