Similar Documents
20 similar documents found (search time: 484 ms)
1.
Neural networks are usually considered naturally parallel computing models, but the number of operators and the complex connection graphs of standard neural models cannot be directly handled by digital hardware devices. In particular, several works show that programmable digital hardware offers a real opportunity for flexible hardware implementations of neural networks. Yet many area and topology problems arise when standard neural models are mapped onto programmable circuits such as FPGAs, so that rapid improvements in FPGA technology cannot be fully exploited. Neural network hardware implementations therefore need to reconcile simple hardware topologies with complex neural architectures. The theoretical and practical framework developed here achieves this combination by applying principles of configurable hardware to neural computation: Field Programmable Neural Arrays (FPNAs) lead to powerful neural architectures that are easy to map onto FPGAs, thanks to a simplified topology and an original data exchange scheme. This paper shows how FPGAs led to the definition of the FPNA computation paradigm, and how FPNAs contribute to current and future FPGA-based neural implementations by solving the general problems raised by implementing complex neural networks on FPGAs.
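To make the mapping idea concrete, here is a minimal Python sketch of an FPNA-flavoured grid in which neurons exchange values only over a sparse set of local links, so the topology fits an FPGA routing fabric rather than an all-to-all connection graph. The grid size, weights, and update rule are invented for illustration; this is not the paper's actual data-exchange scheme.

```python
# Hypothetical sketch: neural resources on a 4x4 torus, each connected only
# to its four neighbours, exchanging values in synchronous steps.
import math

GRID = 4  # 4x4 array of neural resources

def neighbors(r, c):
    """Local links only: north, south, east, west (torus wrap-around)."""
    return [((r - 1) % GRID, c), ((r + 1) % GRID, c),
            (r, (c - 1) % GRID), (r, (c + 1) % GRID)]

def step(state, weights):
    """One synchronous exchange: each node aggregates its neighbours' outputs."""
    nxt = {}
    for r in range(GRID):
        for c in range(GRID):
            total = sum(weights[(r, c)][i] * state[n]
                        for i, n in enumerate(neighbors(r, c)))
            nxt[(r, c)] = math.tanh(total)  # simple neural activation
    return nxt

state = {(r, c): 0.1 * (r + c) for r in range(GRID) for c in range(GRID)}
weights = {(r, c): [0.25] * 4 for r in range(GRID) for c in range(GRID)}
for _ in range(3):
    state = step(state, weights)
print(state[(0, 0)])
```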

2.
Mainstream computing equipment and the advent of affordable multi-Gigabit communication technology permit us to address data acquisition and processing problems with clusters of COTS machinery. Such networks typically contain heterogeneous platforms, real-time partitions and even custom devices. Vital overall system requirements are high efficiency and flexibility, and in preceding projects we experienced the difficulty of meeting both requirements at once. Intelligent I/O (I2O) is an industry specification that defines a uniform messaging format and execution environment for hardware- and operating-system-independent device drivers in systems with processor-based communication equipment. Mapping this concept to a distributed computing environment and encapsulating the details of the specification in an application-programming framework allow us to provide architectural support for (i) efficient and (ii) extensible cluster operation. This paper portrays our view of applying I2O to high-performance clusters. We demonstrate the feasibility of this approach and report on the efficiency of our XDAQ software framework for distributed data acquisition systems.
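A minimal Python sketch of the uniform-messaging idea, loosely in the spirit of I2O-style frameworks: a fixed header identifies a target and function, and a dispatcher routes frames to handlers independently of the underlying transport. The field names and sizes here are invented, not the actual I2O specification layout.

```python
# Hypothetical uniform message frame and dispatch table; not the real I2O
# wire format.
import struct

HEADER = struct.Struct("!HHI")  # (target_id, function_code, payload_len)

def pack(target_id, function_code, payload: bytes) -> bytes:
    return HEADER.pack(target_id, function_code, len(payload)) + payload

handlers = {}  # (target_id, function_code) -> callable

def register(target_id, function_code):
    def deco(fn):
        handlers[(target_id, function_code)] = fn
        return fn
    return deco

def dispatch(frame: bytes):
    tid, fc, n = HEADER.unpack_from(frame)
    handlers[(tid, fc)](frame[HEADER.size:HEADER.size + n])

@register(target_id=7, function_code=1)
def on_readout(payload):
    print("readout fragment:", payload)

dispatch(pack(7, 1, b"\x00\x01\x02"))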

3.
Autonomic computing—self-configuring, self-healing, self-managing applications, systems and networks—is a promising solution to ever-increasing system complexity and the spiraling costs of human management as systems scale to global proportions. Most results to date, however, suggest ways to architect new software designed from the ground up as autonomic systems, whereas in the real world organizations continue to use stovepipe legacy systems and/or build "systems of systems" that draw on a gamut of disparate technologies from numerous vendors. Our goal is to retrofit autonomic computing onto such systems externally, without any need to understand, modify or even recompile the target system's code. We present an autonomic infrastructure that operates similarly to active middleware, explicitly adding autonomic services to pre-existing systems via continual monitoring and a feedback loop that performs reconfiguration and/or repair as needed. Our lightweight design and separation of concerns enable easy adoption of individual components for use with a variety of target systems, independent of the rest of the full infrastructure. This work has been validated by several case studies spanning multiple real-world application domains.
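A minimal sketch of such an externalized feedback loop in Python: probe a legacy target as a black box, compare against a policy, and trigger a repair action. The probe target and the repair command are hypothetical placeholders, not the paper's infrastructure.

```python
# Minimal monitor -> analyze -> repair loop operating entirely outside the
# (unmodified) target system.
import socket
import subprocess
import time

def probe_alive(host="localhost", port=8080) -> bool:
    """Black-box liveness probe; no access to the target's code needed."""
    try:
        with socket.create_connection((host, port), timeout=2):
            return True
    except OSError:
        return False

def repair():
    # Placeholder repair action: restart the legacy service externally.
    subprocess.run(["systemctl", "restart", "legacy-app"], check=False)

for _ in range(3):          # a few monitoring cycles, for demonstration
    if not probe_alive():
        repair()
    time.sleep(10)
```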

4.
The Reconfigurable Computing Cluster (RCC) project has been investigating unconventional architectures for high-end computing using a cluster of FPGA devices connected by a high-speed, custom network. Most applications use the FPGAs to realize an embedded System-on-a-Chip (SoC) design augmented with application-specific accelerators, forming a message-passing parallel computer. Other applications take a single accelerator core and tessellate it across all of the devices, treating them like a large virtual FPGA. The experimental hardware has also been used for basic computer research by emulating novel architectures. This article discusses the genesis of the overarching project, summarizes the results of the individual investigations that have been completed, and considers how this approach may prove useful in the investigation of future Exascale systems.

5.
The emergence of ad-hoc pervasive connectivity for devices based on Bluetooth-like systems provides a new way to create applications for mobile systems. We seek to realize ubiquitous computing systems based on the cooperation of autonomous, dynamic and adaptive components (hardware as well as software) located in the vicinity of one another. In this paper we present this vision. We also describe a prototype system we have developed that implements parts of it: in particular, a system that combines agent-oriented and service-oriented approaches and provides dynamic service discovery. We point out why existing systems such as Jini are not suited to this task, and how our system improves on them.
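To illustrate the dynamic-discovery idea, here is a toy Python registry in which nearby devices announce services with a time-to-live, so lookups return only currently visible matches. All names and the TTL mechanism are invented for illustration; the prototype described in the paper is agent- and service-oriented and considerably richer.

```python
# Toy proximity-style service discovery: announcements expire, so only
# devices still in the vicinity (recently announced) are discoverable.
import time

registry = {}  # service_type -> {device_id: expiry_time}
TTL = 30.0     # seconds an announcement stays valid

def announce(device_id, service_type):
    registry.setdefault(service_type, {})[device_id] = time.time() + TTL

def discover(service_type):
    now = time.time()
    entries = registry.get(service_type, {})
    return [d for d, exp in entries.items() if exp > now]

announce("headset-42", "audio-sink")
print(discover("audio-sink"))   # ['headset-42'] while the announcement is fresh
```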

6.
The ability to decode neural activity into meaningful control signals for prosthetic devices is critical to the development of clinically useful brain–machine interfaces (BMIs). Such systems require input from tens to hundreds of brain-implanted recording electrodes in order to deliver robust and accurate performance; in serving that primary function they should also minimize power dissipation in order to avoid damaging neural tissue; and they should transmit data wirelessly in order to minimize the risk of infection associated with chronic, transcutaneous implants. Electronic architectures for brain–machine interfaces must therefore minimize size and power consumption, while maximizing the ability to compress data to be transmitted over limited-bandwidth wireless channels. Here we present a system of extremely low computational complexity, designed for real-time decoding of neural signals and suited for highly scalable implantable systems. Our programmable architecture is an explicit implementation of a universal computing machine emulating the dynamics of a network of integrate-and-fire neurons; it requires no arithmetic operations except for counting, and decodes neural signals using only computationally inexpensive logic operations. The simplicity of this architecture does not compromise its ability to compress raw neural data by substantial factors. We describe a set of decoding algorithms based on this computational architecture, one designed to operate within an implanted system, minimizing its power consumption and data transmission bandwidth, and a complementary set of algorithms for learning, programming the decoder, and postprocessing the decoded output, designed to operate in an external, nonimplanted unit. The implementation of the implantable portion is estimated to require fewer than 5000 operations per second; a proof-of-concept, 32-channel field-programmable gate array (FPGA) implementation of this portion is consequently energy-efficient. We validate the performance of the overall system by decoding electrophysiologic data from a behaving rodent.
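To illustrate the counting-only decoding style the abstract describes, here is a minimal Python sketch: each output unit increments a counter when a connected input channel spikes and fires when the counter crosses a threshold, using only increments, comparisons, and resets. The connectivity and threshold are invented; this is not the paper's decoder.

```python
# Minimal counting-only integrate-and-fire decoder: no arithmetic beyond
# counting, decisions by threshold comparison.
THRESHOLD = 4
conn = {0: [0, 1], 1: [1, 2]}   # output unit -> input channels it counts

def decode(spike_frames):
    counters = {u: 0 for u in conn}
    out = []
    for frame in spike_frames:          # frame: set of channels that spiked
        decisions = []
        for unit, channels in conn.items():
            counters[unit] += sum(1 for ch in channels if ch in frame)
            if counters[unit] >= THRESHOLD:   # threshold crossing: fire, reset
                decisions.append(unit)
                counters[unit] = 0
        out.append(decisions)
    return out

print(decode([{0, 1}, {1}, {1, 2}, {0, 2}]))  # both units fire on frame 3
```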

7.
Embryonic electronics.
D Mange, M Sipper, P Marchal. Bio Systems, 1999, 51(3):145-152.

8.
Due to the increasing diversity of parallel architectures and the growing development time of parallel applications, performance portability has become one of the major considerations when designing the next generation of parallel program execution models, APIs, and runtime system software. This paper analyzes both the code portability and the performance portability of parallel programs for fine-grained multi-threaded execution and architecture models. We concentrate on one particular event-driven, fine-grained multi-threaded execution model, EARTH, and discuss several design considerations of the EARTH model and runtime system that contribute to the performance portability of parallel applications. We believe these are important issues for future high-end computing system software design. Four representative benchmarks were run on several different parallel architectures, including two clusters listed in the 23rd TOP500 list. The results demonstrate that EARTH-based programs can achieve robust performance portability across the selected hardware platforms without any code modification or tuning.
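The following toy Python sketch captures the firing rule at the heart of event-driven fine-grained multithreading in the spirit of EARTH: a fiber carries a synchronization count of outstanding inputs and becomes runnable as soon as the count reaches zero. The Fiber class and API here are invented, not the EARTH runtime interface.

```python
# Toy dataflow-style fiber scheduling: a fiber runs once all of its input
# tokens have arrived.
from collections import deque

class Fiber:
    def __init__(self, name, n_inputs, body):
        self.name, self.count, self.body = name, n_inputs, body
        self.inputs = {}

ready = deque()

def send(fiber, slot, value):
    """Deliver one input token; enable the fiber when all inputs arrived."""
    fiber.inputs[slot] = value
    fiber.count -= 1
    if fiber.count == 0:
        ready.append(fiber)

def run():
    while ready:
        f = ready.popleft()
        f.body(f.inputs)

add = Fiber("add", 2, lambda ins: print("sum =", ins["a"] + ins["b"]))
send(add, "a", 3)
send(add, "b", 4)
run()   # prints: sum = 7
```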

9.
P Erdi. Bio Systems, 1988, 21(2):125-133.
According to the old metaphor of classical cybernetics, the brain can be considered a computer. Newer theoretical endeavours reverse the question and ask: what could neurobiology offer to engineers of near-future computer systems? Three not completely disjoint abstract functions of the nervous system, namely pattern formation, pattern recognition and action, can be treated in a unified conceptual framework. Storage and retrieval mechanisms of information are connected to fault-tolerant, adaptive parallel structures. "Learning" and "plastic behaviour" are interpreted in terms of the theory of non-linear dynamic systems. As neural development and plasticity can be approached by deterministic models superimposed with random influences, noise might also play a positive role in the operation of technical computing devices. Molecular computation is discussed in relation to a possible hardware realization of "neurobiology-based" computers.

10.
Ant colony optimisation (ACO) is a nature-inspired, population-based metaheuristic that has been used to solve a wide variety of computationally hard problems. To take full advantage of the inherently stochastic and distributed nature of the method, we describe a parallelization strategy that exploits these features on heterogeneous and large-scale, massively parallel hardware systems. Our approach balances workload effectively by dynamically assigning jobs to heterogeneous resources, which then run ACO implementations using different search strategies. Our experimental results confirm significant improvements in both solution quality and energy expenditure, opening up new possibilities for the development of metaheuristic-based solutions to "real world" problems on high-performance, energy-efficient contemporary heterogeneous computing platforms.
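A rough Python sketch of the dispatch pattern: jobs with different search strategies are handed to a pool of workers and the best result wins. Each "colony" here is a crude 2-swap local search stand-in rather than full pheromone bookkeeping, and the TSP instance and parameters are invented; only the dynamic-assignment structure reflects the abstract.

```python
# Sketch: heterogeneous search strategies dispatched to a worker pool.
import random
from concurrent.futures import ProcessPoolExecutor

_rng0 = random.Random(0)   # deterministic instance, same in every worker
CITIES = [(_rng0.random(), _rng0.random()) for _ in range(20)]

def tour_len(tour):
    return sum(((CITIES[a][0] - CITIES[b][0]) ** 2 +
                (CITIES[a][1] - CITIES[b][1]) ** 2) ** 0.5
               for a, b in zip(tour, tour[1:] + tour[:1]))

def run_colony(args):
    swaps_per_iter, iters, seed = args      # per-colony search strategy
    rng = random.Random(seed)
    best = list(range(len(CITIES)))
    for _ in range(iters):
        cand = best[:]
        for _ in range(swaps_per_iter):     # strategy: move aggressiveness
            i, j = rng.sample(range(len(cand)), 2)
            cand[i], cand[j] = cand[j], cand[i]
        if tour_len(cand) < tour_len(best):
            best = cand
    return tour_len(best), swaps_per_iter

if __name__ == "__main__":
    jobs = [(k, 2000, seed) for seed, k in enumerate([1, 2, 3, 4])]
    with ProcessPoolExecutor() as pool:
        print(min(pool.map(run_colony, jobs)))  # best tour length, strategy
```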

11.

Background

Medical devices increasingly depend on computing functions such as wireless communication and Internet connectivity for software-based control of therapies and network-based transmission of patients’ stored medical information. These computing capabilities introduce security and privacy risks, yet little is known about the prevalence of such risks within the clinical setting.

Methods

We used three comprehensive, publicly available databases maintained by the Food and Drug Administration (FDA) to evaluate recalls and adverse events related to security and privacy risks of medical devices.

Results

Review of weekly enforcement reports identified 1,845 recalls; 605 (32.8%) of these included computers, 35 (1.9%) stored patient data, and 31 (1.7%) were capable of wireless communication. Searches of databases specific to recalls and adverse events identified only one event with a specific connection to security or privacy. Software-related recalls were relatively common, and most (81.8%) mentioned the possibility of upgrades, though only half of these provided specific instructions for the update mechanism.

Conclusions

Our review of recalls and adverse events from federal government databases reveals sharp inconsistencies with databases at individual providers with respect to security and privacy risks. Recalls related to software may increase security risks because of unprotected update and correction mechanisms. To detect signals of security and privacy problems that adversely affect public health, federal postmarket surveillance strategies should rethink how to effectively and efficiently collect data on security and privacy problems in devices that increasingly depend on computing systems susceptible to malware.

12.
Advancing the size and complexity of neural network models leads to an ever increasing demand for computational resources for their simulation. Neuromorphic devices offer a number of advantages over conventional computing architectures, such as high emulation speed or low power consumption, but this usually comes at the price of reduced configurability and precision. In this article, we investigate the consequences of several such factors that are common to neuromorphic devices, more specifically limited hardware resources, limited parameter configurability and parameter variations due to fixed-pattern noise and trial-to-trial variability. Our final aim is to provide an array of methods for coping with such inevitable distortion mechanisms. As a platform for testing our proposed strategies, we use an executable system specification (ESS) of the BrainScaleS neuromorphic system, which has been designed as a universal emulation back-end for neuroscientific modeling. We address the most essential limitations of this device in detail and study their effects on three prototypical benchmark network models within a well-defined, systematic workflow. For each network model, we start by defining quantifiable functionality measures by which we then assess the effects of typical hardware-specific distortion mechanisms, both in idealized software simulations and on the ESS. For those effects that cause unacceptable deviations from the original network dynamics, we suggest generic compensation mechanisms and demonstrate their effectiveness. Both the suggested workflow and the investigated compensation mechanisms are largely back-end independent and do not require additional hardware configurability beyond the one required to emulate the benchmark networks in the first place. We hereby provide a generic methodological environment for configurable neuromorphic devices that are targeted at emulating large-scale, functional neural networks.
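One generic compensation idea of this kind is calibration against fixed-pattern noise: measure each unit's realized parameter once, then pre-distort the requested value. The Python sketch below uses an invented additive-offset noise model, not the BrainScaleS hardware behaviour.

```python
# Calibration-style compensation for fixed-pattern parameter noise
# (invented noise model: constant per-unit offset).
import random

rng = random.Random(1)
offset = [rng.gauss(0.0, 0.1) for _ in range(8)]    # fixed-pattern noise

def emulate(unit, requested):
    """Hardware realizes the requested value plus a fixed per-unit offset."""
    return requested + offset[unit]

# Calibration step: measure realized values for a known setting and store
# the per-unit correction.
correction = [0.5 - emulate(u, 0.5) for u in range(8)]

target = 1.0
realized = [emulate(u, target + correction[u]) for u in range(8)]
print(max(abs(x - target) for x in realized))  # ~0: offsets compensated
```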

13.
High-performance and distributed computing systems such as peta-scale, grid, and cloud infrastructures are increasingly used for running scientific models and business services. These systems experience large availability variations through hardware and software failures. Resource providers need to account for these variations while providing the required quality of service (QoS) at appropriate cost in dynamic resource and application environments. Although the performance and reliability of these systems have been studied separately, there has been little analysis of the QoS lost at varying availability levels. In this paper, we present a resource performability model to estimate lost performance and the corresponding cost at varying availability levels. We use the resulting model in a multi-phase planning approach for scheduling a set of deadline-sensitive meteorological workflows atop grid and cloud resources to trade off performance, reliability and cost. We use simulation results driven by failure data collected over the lifetime of high-performance systems to demonstrate how the proposed scheme better accounts for resource availability.
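A toy worked example of the lost-QoS idea: expected lost work over a planning horizon scales with (1 - availability), which can then be weighed against resource price. All numbers and the cost model in this Python sketch are invented, not the paper's performability model.

```python
# Toy performability trade-off: lost work vs. resource cost per day.
def lost_work(throughput_per_hour, availability, hours):
    """Expected work lost to unavailability over a planning horizon."""
    return throughput_per_hour * hours * (1.0 - availability)

options = {                 # resource class -> (availability, $/hour)
    "grid":  (0.95, 0.10),
    "cloud": (0.995, 0.45),
}
for name, (avail, price) in options.items():
    lost = lost_work(throughput_per_hour=100, availability=avail, hours=24)
    print(f"{name}: lose ~{lost:.0f} jobs/day at ${price * 24:.2f}/day")
```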

14.
Designing hardware for miniaturized robotics that mimics the capabilities of flying insects is of interest because the two share similar constraints (small size, low weight, and low energy consumption). Research in this area aims to enable robots with similarly efficient flight and cognitive abilities. Visual processing is important to flying insects' impressive flight capabilities, but embodiment of insect-like visual systems is currently limited by the hardware available: suitable hardware is prohibitively expensive, difficult to reproduce, unable to simulate insect vision characteristics accurately, or too heavy for small robotic platforms. These limitations hamper the development of platforms for embodiment, which in turn hampers progress in understanding how biological systems fundamentally work. To address this gap, this paper proposes an inexpensive, lightweight robotic system for modelling insect vision. The system is mounted and tested on a robotic platform for mobile applications, and the camera and insect vision models are then evaluated. We analyse the system's potential for embodiment of higher-level visual processes (i.e. motion detection) and for the development of vision-based navigation for robotics in general. Optic flow computed from sample camera data is compared to a perfect, simulated bee world, showing an excellent resemblance.
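As an illustration of the kind of optic-flow computation involved, the Python sketch below computes dense Farneback flow with OpenCV on a synthetic pair of frames standing in for camera and simulated-world data. It assumes the widely available opencv-python (cv2) and NumPy packages; the parameter values are common defaults, not the paper's settings.

```python
# Dense optic flow on synthetic frames: a bright patch shifted 3 px right.
import cv2
import numpy as np

prev = np.zeros((64, 64), np.uint8)
nxt = np.zeros((64, 64), np.uint8)
prev[20:30, 20:30] = 255
nxt[20:30, 23:33] = 255        # the patch moved 3 px to the right

flow = cv2.calcOpticalFlowFarneback(prev, nxt, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)
dx = flow[20:30, 20:30, 0].mean()
print(f"mean horizontal flow in patch: {dx:.2f} px")  # ~3 px expected
```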

15.
16.
Microfluidic devices often rely on antibody-antigen interactions as a means of separating analytes of interest from sample matrices. Immunoassays and immunoaffinity separations performed in miniaturized formats offer selective target isolation with minimal reagent consumption and reduced analysis times. The introduction of biological fluids and other complicated matrices often requires sample pretreatment or system modifications for compatibility with small-scale devices. Miniaturization of external equipment creates the potential for portable use, for example in point-of-care patient settings. Microfluidic immunoaffinity systems, including capillary and chip platforms, have been assembled from basic instrument components for fluid control, sample introduction, and detection. The current review focuses on the use of immunoaffinity separations in microfluidic devices, with an emphasis on pump-based flow and the analysis of biological samples.

17.
High-performance computing increasingly occurs on "computational grids" composed of heterogeneous and geographically distributed systems of computers, networks, and storage devices that collectively act as a single "virtual" computer. A key challenge in this environment is to provide efficient access to data distributed across remote data servers. Our parallel I/O framework, called Armada, allows application and data-set providers to flexibly compose graphs of processing modules that describe the distribution, application interfaces, and processing required of the dataset before computation. Although the framework provides a simple programming model for the application programmer and the data-set provider, the resulting graph may contain bottlenecks that prevent efficient data access. In this paper, we present an algorithm that restructures Armada graphs, distributing computation and data flow to improve performance in the context of a wide-area computational grid.
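A toy Python sketch of why such restructuring pays off: if a selective filter module commutes with a network transfer, pushing it toward the data server shrinks the data crossing the wide-area link. The module names and cost model are invented, not Armada's actual algorithm.

```python
# Toy graph-restructuring payoff: count bytes crossing the 'net' edge
# depending on where the 'filter' module sits in the pipeline.
RECORD_BYTES = 100

def bytes_moved(pipeline, n_records, selectivity=0.1):
    """Records remaining when the 'net' stage is reached, in bytes."""
    n = n_records
    for stage in pipeline:
        if stage == "filter":
            n = int(n * selectivity)
        elif stage == "net":
            return n * RECORD_BYTES
    return n * RECORD_BYTES

naive = ["read", "net", "filter", "compute"]          # filter at the client
restructured = ["read", "filter", "net", "compute"]   # filter at the server
print(bytes_moved(naive, 10_000), "vs", bytes_moved(restructured, 10_000))
```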

18.

Background  

Arabidopsis thaliana is now the model organism for genetic and molecular plant studies, but growing conditions may still impair the significance and reproducibility of the experimental strategies developed. Besides the use of phytotronic cabinets, controlling plant nutrition may be critical, and this can be achieved in hydroponics. The availability of such a system would also greatly facilitate studies dealing with root development. However, because of its small size and rosette growth habit, Arabidopsis is difficult to grow in standard hydroponic devices, and the systems described in recent years remain hard to transpose to a large scale. Our aim was to design and optimize an up-scalable device that would be adaptable to any experimental conditions.

19.
Although autonomic computing reduces traditional operational costs, it introduces another cost factor related to operation knowledge. This paper focuses on self-healing functionality and proposes a technique that uses reusable component-level operation knowledge. To achieve reusability, the operation knowledge used by the proposed technique excludes system-specific information. Such knowledge is independent of any specific system structure, so self-healing systems can share operation knowledge across organizations and adapt to changes. One obstacle to achieving reusability by excluding system-specific information is the treatment of dependencies among components. To cope with this problem, a dependency injection mechanism is introduced, which works out the needed recovery actions by relating component-level operation knowledge to system-specific information. The paper also describes an implemented prototype together with an application example.
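A minimal Python sketch of the separation the paper argues for: generic, system-agnostic recovery knowledge is kept per component type, and a dependency-injection step binds it to system-specific wiring to produce an ordered recovery plan. All names and the plan format are hypothetical.

```python
# Generic recovery knowledge (reusable) + system wiring (injected).
KNOWLEDGE = {  # component type -> generic recovery action
    "database": "restart service",
    "app-server": "restart service, then warm caches",
}

SYSTEM = {     # system-specific wiring, kept out of the knowledge base
    "orders-db": {"type": "database", "dependents": ["shop-app"]},
    "shop-app": {"type": "app-server", "dependents": []},
}

def recovery_plan(failed):
    """Inject system wiring into generic knowledge: heal the failed
    component first, then every dependent that must be refreshed."""
    plan, queue = [], [failed]
    while queue:
        comp = queue.pop(0)
        plan.append((comp, KNOWLEDGE[SYSTEM[comp]["type"]]))
        queue.extend(SYSTEM[comp]["dependents"])
    return plan

print(recovery_plan("orders-db"))
```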

20.
Recently much effort has been spent on providing a shared address space abstraction on clusters of small-scale symmetric multiprocessors. However, advances in technology will soon make it possible to construct these clusters with larger-scale cc-NUMA nodes, connected by non-coherent networks that offer latencies and bandwidth comparable to the interconnection networks used in hardware cache-coherent systems. The shared memory abstraction can be provided on these systems in software across nodes and in hardware within nodes. Recent simulation results have demonstrated that certain features of modern system area networks can be used to greatly reduce shared virtual memory (SVM) overheads [5,19]. In this work we leverage these results and use detailed system emulation to investigate building future software shared memory clusters. We use an existing, large-scale hardware cache-coherent system with 64 processors to emulate a complete future cluster, port our existing infrastructure (communication layer and shared memory protocol) to this system, and study the behavior of a set of real applications. We present results for both 32- and 64-processor system configurations. We find that: (i) system emulation is invaluable in quantifying the potential benefits of changes in the technology of commodity components; more importantly, it reveals potential problems in future systems that are easily overlooked in simulation studies, so system emulation should be used along with other modeling techniques (e.g., simulation, implementation) to investigate future trends. (ii) Current SVM protocols can only partially take advantage of faster interconnects and wider nodes due to operating system and architectural implications; we quantify the related issues and identify the areas where more research is required for future SVM clusters.
