
1.

Scalable Session Locking for a Distributed File System

Randal C. Burns Robert M. Rees Larry J. Stockmeyer Darrell D.E. Long 《Cluster computing》2001,4(4):295-306

File systems provide an interface for applications to obtain exclusive access to files, in which a process holds privileges to a file that cannot be preempted and that restrict the capabilities of other processes. Local file systems do this by maintaining information about the privileges of current file sessions, and checking subsequent sessions for compatibility. Implementing exclusive access in this manner for distributed file systems degrades performance by requiring every new file session to be registered with a lock server that maintains global session state. We present two techniques for improving the performance of session management in the distributed environment. We introduce a distributed lock for managing file access, called a semi-preemptible lock, that allows clients to cache privileges. Under a semi-preemptible lock, a file system creates new sessions without messages to the lock manager. This improves performance by exploiting locality – the affinity of files to clients. We also present data structures and algorithms for the dynamic evaluation of locks that allow a distributed file system to efficiently manage arbitrarily complex locking. In this case, complex means that an object can be locked in a large number of unique modes. The combination of these techniques results in a distributed locking scheme that supports fine-grained concurrency control with low memory and message overhead and with the assurance that the locking system is correct and avoids unnecessary deadlocks.
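The idea of a cached, semi-preemptible lock can be illustrated with a small Python sketch. This is not the authors' protocol: the class names `LockServer` and `Client`, the method `try_revoke`, and the message counter are all hypothetical, and the sketch assumes a single exclusive mode per file. It only shows the two properties the abstract names: repeated sessions on one client need no message to the lock manager, and a cached lock can be taken away only while no session is using it.

```python
class LockServer:
    """Hypothetical lock manager that counts the requests it receives."""

    def __init__(self):
        self.holder = {}    # file name -> client currently holding the lock
        self.messages = 0   # number of acquire requests seen

    def acquire(self, client, name):
        self.messages += 1
        owner = self.holder.get(name)
        if owner is not None and owner is not client:
            # Ask the current holder to give the lock up; it may refuse.
            if not owner.try_revoke(name):
                return False
        self.holder[name] = client
        return True


class Client:
    """Hypothetical client that caches locks between file sessions."""

    def __init__(self, server):
        self.server = server
        self.cached = set()   # locks cached on this client
        self.sessions = {}    # file name -> number of open sessions

    def open(self, name):
        if name not in self.cached:
            # Only the first session on this client contacts the server.
            if not self.server.acquire(self, name):
                return False
            self.cached.add(name)
        self.sessions[name] = self.sessions.get(name, 0) + 1
        return True

    def close(self, name):
        # The session ends, but the lock stays cached for later sessions.
        self.sessions[name] -= 1

    def try_revoke(self, name):
        if self.sessions.get(name, 0) > 0:
            return False      # an open session makes the lock non-preemptible
        self.cached.discard(name)
        return True
```

In this sketch a client that reopens the same file pays for one server message in total, while a second client is refused until the first client's last session closes.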

2.

Failure-Atomic File Access in the Slice Interposed Network Storage System

Darrell Anderson Jeff Chase 《Cluster computing》2002,5(4):411-419

This paper presents a recovery protocol for block I/O operations in Slice, a storage system architecture for high-speed LANs incorporating network-attached block storage. The goal of the Slice architecture is to provide a network file service with scalable bandwidth and capacity while preserving compatibility with off-the-shelf clients and file server appliances. The Slice prototype virtualizes the Network File System (NFS) protocol by interposing a request switching filter at the client's interface to the network storage system. The distributed Slice architecture separates functions typically combined in central file servers, introducing new challenges for failure atomicity. This paper presents a protocol for atomic file operations and recovery in the Slice architecture, and related support for reliable file storage using mirrored striping. Experimental results from the Slice prototype show that the protocol has low cost in the common case, allowing the system to deliver client file access bandwidths approaching gigabit-per-second network speeds.

3.

Adaptive hybrid storage systems leveraging SSDs and HDDs in HPC cloud environments

Donghun Koo Jik-Soo Kim Soonwook Hwang Hyeonsang Eom Jaehwan Lee 《Cluster computing》2017,20(3):2119-2131

Cloud computing should inherently support various types of data-intensive workloads with different storage access patterns. This makes a high-performance storage system in the Cloud an important component. Emerging flash device technologies such as solid state drives (SSDs) are a viable choice for building high performance computing (HPC) cloud storage systems to address more fine-grained data access patterns. However, the price per bit of SSDs is still higher than that of HDDs. This study proposes an optimized progressive file layout (PFL) method to leverage the advantages of SSDs in a parallel file system such as Lustre so that small file I/O performance can be significantly improved. A PFL can dynamically adjust chunk sizes and stripe patterns according to varying I/O traffic. Extensive experimental results show that this approach (i.e., building a hybrid storage system based on a combination of SSDs and HDDs) can achieve balanced throughput over mixed I/O workloads consisting of large and small file access patterns.

4.

Cost-intelligent application-specific data layout optimization for parallel file systems

Huaiming Song Yanlong Yin Yong Chen Xian-He Sun 《Cluster computing》2013,16(2):285-298

Parallel file systems have been developed in recent years to ease the I/O bottleneck of high-end computing systems. These advanced file systems offer several data layout strategies in order to meet the performance goals of specific I/O workloads. However, while a layout policy may perform well on some I/O workloads, it may not perform as well on others. Peak I/O performance is rarely achieved due to the complex data access patterns. Data access is application dependent. In this study, a cost-intelligent data access strategy based on the application-specific optimization principle is proposed. This strategy improves the I/O performance of parallel file systems. We first present examples to illustrate the difference in performance under different data layouts. By developing a cost model which estimates the completion time of data accesses in various data layouts, the layout can better match the application. Static layout optimization can be used for applications with dominant data access patterns, and dynamic layout selection with hybrid replications can be used for applications with complex I/O patterns. Theoretical analysis and experimental testing have been conducted to verify the proposed cost-intelligent layout approach. Analytical and experimental results show that the proposed cost model is effective and the application-specific data layout approach can provide up to a 74% performance improvement for data-intensive applications.

5.

Distributed File System Virtualization Techniques Supporting On-Demand Virtual Machine Environments for Grid Computing

Ming Zhao Jian Zhang Renato J. Figueiredo 《Cluster computing》2006,9(1):45-56

This paper presents a data management solution which allows fast Virtual Machine (VM) instantiation and efficient run-time execution to support VMs as execution environments in Grid computing. It is based on novel distributed file system virtualization techniques and is unique in that: (1) it provides on-demand cross-domain access to VM state for unmodified VM monitors; (2) it enables private file system channels for VM instantiation by secure tunneling and session-key based authentication; (3) it supports user-level and write-back disk caches, per-application caching policies and middleware-driven consistency models; and (4) it leverages application-specific meta-data associated with files to expedite data transfers. The paper reports on its performance in wide-area setups using VMware-based VMs. Results show that the solution delivers performance over 30% better than native NFS and with warm caches it can bring the application-perceived overheads below 10% compared to a local-disk setup. The solution also allows a VM with 1.6 GB virtual disk and 320 MB virtual memory to be cloned within 160 seconds for the first clone and within 25 seconds for subsequent clones. Ming Zhao is a PhD candidate in the Department of Electrical and Computer Engineering and a member of the Advanced Computing and Information Systems Laboratory at the University of Florida. He received the degrees of BE and ME from Tsinghua University. His research interests are in the areas of computer architecture, operating systems and distributed computing. Jian Zhang is a PhD student in the Department of Electrical and Computer Engineering at the University of Florida and a member of the Advanced Computing and Information Systems Laboratory (ACIS). Her research interest is in virtual machines and Grid computing. She is a member of the IEEE and the ACM. Renato J. Figueiredo received the B.S. and M.S. degrees in Electrical Engineering from the Universidade de Campinas in 1994 and 1995, respectively, and the Ph.D. degree in Electrical and Computer Engineering from Purdue University in 2001. From 2001 until 2002 he was on the faculty of the School of Electrical and Computer Engineering of Northwestern University at Evanston, Illinois. In 2002 he joined the Department of Electrical and Computer Engineering of the University of Florida as an Assistant Professor. His research interests are in the areas of computer architecture, operating systems, and distributed systems.

6.

Seamless Access to Decentralized Storage Services in Computational Grids via a Virtual File System

Renato J. Figueiredo Nirav Kapadia José A.B. Fortes 《Cluster computing》2004,7(2):113-122

This paper describes a novel technique for establishing a virtual file system that allows data to be transferred user-transparently and on-demand across computing and storage servers of a computational grid. Its implementation is based on extensions to the Network File System (NFS) that are encapsulated in software proxies. A key differentiator between this approach and previous work is the way in which file servers are partitioned: while conventional file systems share a single (logical) server across multiple users, the virtual file system employs multiple proxy servers that are created, customized and terminated dynamically, for the duration of a computing session, on a per-user basis. Furthermore, the solution does not require modifications to standard NFS clients and servers. The described approach has been deployed in the context of the PUNCH network-computing infrastructure, and is unique in its ability to integrate unmodified, interactive applications (even commercial ones) and existing computing infrastructure into a network computing environment. Experimental results show that: (1) the virtual file system performs well in comparison to native NFS in a local-area setup, with mean overheads of 1 and 18%, for the single-client execution of the Andrew benchmark in two representative computing environments, (2) the average overhead for eight clients can be reduced to within 1% of native NFS with the use of concurrent proxies, (3) the wide-area performance is within 1% of the local-area performance for a typical compute-intensive PUNCH application (SimpleScalar), while for the I/O-intensive application Andrew the wide-area performance is 5.5 times worse than the local-area performance.

7.

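Characteristics and Problems of Medical Research Archive Management in the New Era

孟若娟 周志红 李鹍 仰曙芬 《现代生物医学进展》 (Progress in Modern Biomedicine) 2011,11(8):1576-1578

With the rapid development of medical research, a series of new problems and contradictions have emerged in the practice of managing medical research archives. Starting from the characteristics of medical research archives and the requirements of modern medical research, this paper discusses in depth the management system, management model, management technology, staff-wide archive awareness, technical platform construction, and the development and use of archives. The aim is to strengthen archive awareness, raise the level of research archive management, and better serve medical research activities.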

8.

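Characteristics and Problems of Medical Research Archive Management in the New Era

孟若娟 周志红 李鹍 仰曙芬 《生物磁学》 (Biomagnetism) 2011,(8):1576-1578

With the rapid development of medical research, a series of new problems and contradictions have emerged in the practice of managing medical research archives. Starting from the characteristics of medical research archives and the requirements of modern medical research, this paper discusses in depth the management system, management model, management technology, staff-wide archive awareness, technical platform construction, and the development and use of archives. The aim is to strengthen archive awareness, raise the level of research archive management, and better serve medical research activities.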

9.

Discretionary Caching for I/O on Clusters

Murali Vilayannur Anand Sivasubramaniam Mahmut Kandemir Rajeev Thakur Robert Ross 《Cluster computing》2006,9(1):29-44

I/O bottlenecks are already a problem in many large-scale applications that manipulate huge datasets. This problem is expected to get worse as applications get larger, and the I/O subsystem performance lags behind processor and memory speed improvements. At the same time, off-the-shelf clusters of workstations are becoming a popular platform for demanding applications due to their cost-effectiveness and widespread deployment. Caching I/O blocks is one effective way of alleviating disk latencies, and there can be multiple levels of caching on a cluster of workstations. Previous studies have shown the benefits of caching—whether it be local to a particular node, or a shared global cache across the cluster—for certain applications. However, we show that while caching is useful in some situations, it can hurt performance if we are not careful about what to cache and when to bypass the cache. This paper presents compilation techniques and runtime support to address this problem. These techniques are implemented and evaluated on an experimental Linux/Pentium cluster running a parallel file system. Our results using a diverse set of applications (scientific and commercial) demonstrate the benefits of a discretionary approach to caching for I/O subsystems on clusters, providing as much as 48% savings in overall execution time over indiscriminately caching everything in some applications. Parts of this paper have appeared in the Proceedings of the 3rd IEEE/ACM Symposium on Cluster Computing and the Grid (CCGrid'03). This paper is an extension of these prior results, and includes a more extensive performance evaluation. Murali Vilayannur is a Ph.D. student in the Department of Computer Science and Engineering at The Pennsylvania State University. His research interests are in High-Performance Parallel I/O, File Systems, Virtual Memory Algorithms and Operating Systems. Anand Sivasubramaniam received his B.Tech. 
in Computer Science from the Indian Institute of Technology, Madras, in 1989, and the M.S. and Ph.D. degrees in Computer Science from the Georgia Institute of Technology in 1991 and 1995 respectively. He has been on the faculty at The Pennsylvania State University since Fall 1995 where he is currently an Associate Professor. Anand's research interests are in computer architecture, operating systems, performance evaluation, and applications for both high performance computer systems and embedded systems. Anand's research has been funded by NSF through several grants, including the CAREER award, and from industries including IBM, Microsoft and Unisys Corp. He has several publications in leading journals and conferences, and is on the editorial board of IEEE Transactions on Computers and IEEE Transactions on Parallel and Distributed Systems. He is a recipient of the 2002 IBM Faculty Award. Anand is a member of the IEEE, IEEE Computer Society, and ACM. Mahmut Kandemir received the B.Sc. and M.Sc. degrees in control and computer engineering from Istanbul Technical University, Istanbul, Turkey, in 1988 and 1992, respectively. He received the Ph.D. from Syracuse University, Syracuse, New York in electrical engineering and computer science, in 1999. He has been an assistant professor in the Computer Science and Engineering Department at the Pennsylvania State University since August 1999. His main research interests are optimizing compilers, I/O intensive applications, and power-aware computing. He is a member of the IEEE and the ACM. Rajeev Thakur is a Computer Scientist in the Mathematics and Computer Science Division at Argonne National Laboratory. He received a B.E. from the University of Bombay, India, in 1990, M.S. from Syracuse University in 1992, and Ph.D. from Syracuse University in 1995, all in computer engineering. His research interests are in the area of high-performance computing in general and high-performance networking and I/O in particular. 
He was a member of the MPI Forum and participated actively in the definition of the I/O part of the MPI-2 standard. He is the author of a widely used, portable implementation of MPI-IO, called ROMIO. He is also a co-author of the book “Using MPI-2: Advanced Features of the Message Passing Interface” published by MIT Press. Robert Ross received his Ph.D. in Computer Engineering from Clemson University in 2000. He is now an Assistant Scientist in the Mathematics and Computer Science Division at Argonne National Laboratory. His research interests are in message passing and storage systems for high performance computing environments. He is the primary author and lead developer for the Parallel Virtual File System (PVFS), a parallel file system for Linux clusters. Current projects include the ROMIO MPI-IO implementation, PVFS, PVFS2, and the MPICH2 implementation of the MPI message passing interface.

10.

An SCI-Based PC Cluster Utilizing Coherent Network Cache

Sang-Hwa Chung Soo-Cheol Oh 《Cluster computing》2003,6(2):153-159

It is extremely important to minimize network access time in constructing a high-performance PC cluster system. For an SCI-based PC cluster, it is possible to reduce the network access time by maintaining network cache in each cluster node. This paper presents a Network-Cache-Coherent-NUMA (NCC-NUMA) card that utilizes network cache for SCI-based PC clustering. The NCC-NUMA card is directly plugged into the PCI slot of each node, and contains shared memory, network cache, and interconnection modules. The network cache is maintained for the shared memory on the PCI bus of cluster nodes. The coherency mechanism between the network cache and the shared memory is based on the IEEE SCI standard. Both a simulator and an NCC-NUMA prototype card are developed to evaluate the performance of the system. According to the experiments, the cluster system with the NCC-NUMA card showed considerable improvements compared with an SCI-based cluster without network cache.

11.

BioTorrents: A File Sharing Service for Scientific Data

Morgan G. I. Langille Jonathan A. Eisen 《PloS one》2010,5(4)

The transfer of scientific data has emerged as a significant challenge, as datasets continue to grow in size and demand for open access sharing increases. Current methods for file transfer do not scale well for large files and can cause long transfer times. In this study we present BioTorrents, a website that allows open access sharing of scientific data and uses the popular BitTorrent peer-to-peer file sharing technology. BioTorrents allows files to be transferred rapidly due to the sharing of bandwidth across multiple institutions and provides more reliable file transfers due to the built-in error checking of the file sharing technology. BioTorrents contains multiple features, including keyword searching, category browsing, RSS feeds, torrent comments, and a discussion forum. BioTorrents is available at http://www.biotorrents.net.

12.

Exploiting redundancy to boost performance in a RAID-10 style cluster-based file system

Yifeng Zhu Hong Jiang Xiao Qin Dan Feng David R. Swanson 《Cluster computing》2006,9(4):433-447

While aggregating the throughput of existing disks on cluster nodes is a cost-effective approach to alleviate the I/O bottleneck in cluster computing, this approach suffers from potential performance degradations due to contentions for shared resources on the same node between storage data processing and user task computation. This paper proposes to judiciously utilize the storage redundancy in the form of mirroring existed in a RAID-10 style file system to alleviate this performance degradation. More specifically, a heuristic scheduling algorithm is developed, motivated from the observations of a simple cluster configuration, to spatially schedule write operations on the nodes with less load among each mirroring pair. The duplication of modified data to the mirroring nodes is performed asynchronously in the background. The read performance is improved by two techniques: doubling the degree of parallelism and hot-spot skipping. A synthetic benchmark is used to evaluate these algorithms in a real cluster environment and the proposed algorithms are shown to be very effective in performance enhancement. Yifeng Zhu received his B.Sc. degree in Electrical Engineering in 1998 from Huazhong University of Science and Technology, Wuhan, China; the M.S. and Ph.D. degree in Computer Science from University of Nebraska – Lincoln in 2002 and 2005 respectively. He is an assistant professor in the Electrical and Computer Engineering department at University of Maine. His main research interests are cluster computing, grid computing, computer architecture and systems, and parallel I/O storage systems. Dr. Zhu is a Member of ACM, IEEE, the IEEE Computer Society, and the Francis Crowe Society. Hong Jiang received the B.Sc. degree in Computer Engineering in 1982 from Huazhong University of Science and Technology, Wuhan, China; the M.A.Sc. 
degree in Computer Engineering in 1987 from the University of Toronto, Toronto, Canada; and the PhD degree in Computer Science in 1991 from the Texas A&M University, College Station, Texas, USA. Since August 1991 he has been at the University of Nebraska-Lincoln, Lincoln, Nebraska, USA, where he is Professor and Vice Chair in the Department of Computer Science and Engineering. His present research interests are computer architecture, parallel/distributed computing, cluster and Grid computing, computer storage systems and parallel I/O, performance evaluation, real-time systems, middleware, and distributed systems for distance education. He has over 100 publications in major journals and international Conferences in these areas and his research has been supported by NSF, DOD and the State of Nebraska. Dr. Jiang is a Member of ACM, the IEEE Computer Society, and the ACM SIGARCH. Xiao Qin received the BS and MS degrees in computer science from Huazhong University of Science and Technology in 1992 and 1999, respectively. He received the PhD degree in computer science from the University of Nebraska-Lincoln in 2004. Currently, he is an assistant professor in the department of computer science at the New Mexico Institute of Mining and Technology. He had served as a subject area editor of IEEE Distributed System Online (2000–2001). His research interests are in parallel and distributed systems, storage systems, real-time computing, performance evaluation, and fault-tolerance. He is a member of the IEEE. Dan Feng received the Ph.D degree from Huazhong University of Science and Technology, Wuhan, China, in 1997. She is currently a professor of School of Computer, Huazhong University of Science and Technology, Wuhan, China. 
She is the principal scientist of the National Grand Fundamental Research 973 Program of China “Research on the organization and key technologies of the Storage System on the next generation Internet.” Her research interests include computer architecture, storage system, parallel I/O, massive storage and performance evaluation. David Swanson received a Ph.D. in physical (computational) chemistry at the University of Nebraska-Lincoln (UNL) in 1995, after which he worked as an NSF-NATO postdoctoral fellow at the Technical University of Wroclaw, Poland, in 1996, and subsequently as a National Research Council Research Associate at the Naval Research Laboratory in Washington, DC, from 1997–1998. In 1999 he returned to UNL where he directs the Research Computing Facility and currently serves as an Assistant Research Professor in the Department of Computer Science and Engineering. The Office of Naval Research, the National Science Foundation, and the State of Nebraska have supported his research in areas such as large-scale scientific simulation and distributed systems.

13.

Parallel tandem: a program for parallel processing of tandem mass spectra using PVM or MPI and X!Tandem

Duncan DT Craig R Link AJ 《Journal of proteome research》2005,4(5):1842-1847

A method for the rapid correlation of tandem mass spectra to a list of protein sequences in a database has been developed. The combination of the fast and accurate computational search algorithm, X!Tandem, and a Linux cluster parallel computing environment with PVM or MPI, significantly reduces the time required to perform the correlation of tandem mass spectra to protein sequences in a database. A file of tandem mass spectra is divided into a specified number of files, each containing an equal number of the spectra from the larger file. These files are then searched in parallel against a protein sequence database. The results of each parallel output file are collated into one file for viewing through a web interface. Thousands of spectra can be searched in an accurate, practical, and time effective manner. The source code for running Parallel Tandem utilizing either PVM or MPI on Linux operating system is available from http://www.thegpm.org. This source code is made available under Artistic License from the authors.
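The split-and-collate step described in this abstract can be sketched in a few lines of Python. This is an illustration only, not the Parallel Tandem source code: the function names `split_spectra` and `collate` are hypothetical, spectra are represented as plain list items, and the sketch assumes that when the count does not divide evenly, part sizes may differ by one.

```python
def split_spectra(spectra, n_parts):
    """Divide spectra into n_parts lists whose sizes differ by at most one."""
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    base, extra = divmod(len(spectra), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        # The first `extra` parts take one additional spectrum each.
        size = base + (1 if i < extra else 0)
        parts.append(spectra[start:start + size])
        start += size
    return parts


def collate(results):
    """Merge per-part result lists into one list, preserving part order."""
    return [hit for part in results for hit in part]
```

Each part would then be searched independently, for example by one PVM or MPI worker, before the per-part outputs are passed to `collate`.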

14.

Performance Evaluation of the Quadrics Interconnection Network (total citations: 1; self-citations: 0; citations by others: 1)

Fabrizio Petrini Eitan Frachtenberg Adolfy Hoisie Salvador Coll 《Cluster computing》2003,6(2):125-142


15.

MRPack: Multi-Algorithm Execution Using Compute-Intensive Approach in MapReduce

Muhammad Idris Shujaat Hussain Muhammad Hameed Siddiqi Waseem Hassan Hafiz Syed Muhammad Bilal Sungyoung Lee 《PloS one》2015,10(8)

Large quantities of data have been generated from multiple sources at exponential rates in the last few years. These data are generated at high velocity as real time and streaming data in a variety of formats. These characteristics give rise to challenges in their modeling, computation, and processing. Hadoop MapReduce (MR) is a well known data-intensive distributed processing framework using the distributed file system (DFS) for Big Data. Current implementations of MR only support execution of a single algorithm in the entire Hadoop cluster. In this paper, we propose MapReducePack (MRPack), a variation of MR that supports execution of a set of related algorithms in a single MR job. We exploit the computational capability of a cluster by increasing the compute-intensiveness of MapReduce while maintaining its data-intensive approach. It uses the available computing resources by dynamically managing the task assignment and intermediate data. Intermediate data from multiple algorithms are managed using multi-key and skew mitigation strategies. The performance study of the proposed system shows that it is time, I/O, and memory efficient compared to the default MapReduce. The proposed approach reduces the execution time by 200% with an approximate 50% decrease in I/O cost. Complexity and qualitative results analysis shows significant performance improvement.

16.

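Comparison of the Stridulatory File Morphology of Four Cricket Species of the Genus Loxoblemmus (Orthoptera: Gryllidae) (total citations: 9; self-citations: 4; citations by others: 5)

谢令德 郑哲民 《动物分类学报》 (Acta Zootaxonomica Sinica) 2001,26(4):448-453

Using scanning electron microscopy, the morphology of the stridulatory files and file teeth of four common species of the genus Loxoblemmus (Orthoptera: Gryllidae), namely Loxoblemmus equestris, L. angulatus, L. haani and L. doenitzi, was observed and described. Morphological characters such as the bend of the file, the distribution of the teeth, and the left and right ends of the teeth are used here for the first time and described in detail, providing new taxonomic evidence for identifying species of Loxoblemmus and of Gryllidae more broadly.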

17.

Performance Analysis of a Myrinet-Based Cluster

Teddy Surya Gunawan Wentong Cai 《Cluster computing》2003,6(4):299-313

In recent years, there has been a growing interest in the cluster system as an accepted form of supercomputing, due to its high performance at an affordable cost. This paper presents a performance analysis of a Myrinet-based cluster. The communication performance and the effect of background load on parallel applications were analyzed. For point-to-point communication, it was found that an extension to Hockney's model was required to estimate the performance. The proposed model suggests that two ranges should be used for the performance metrics to cope with the cache effect. Moreover, based on the extension of the point-to-point communication model, Xu and Hwang's model for collective communication performance was also extended. Results showed that our models estimate the communication performance better than the previous models. Finally, the interference of other user processes with the cluster system is evaluated by using synthetic background load generation programs.
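For readers unfamiliar with Hockney's model, it estimates the time to send an m-byte message as a startup latency plus the message size divided by the asymptotic bandwidth. The Python sketch below shows that formula and one possible two-range variant. The two-range form is an assumption made for illustration: the abstract does not give the authors' extended model, and the threshold and parameter names (`threshold`, `small`, `large`) are hypothetical.

```python
def hockney_time(m_bytes, t0, r_inf):
    """Hockney's model: startup latency t0 (s) plus m / r_inf (bytes per second)."""
    return t0 + m_bytes / r_inf


def two_range_time(m_bytes, threshold, small, large):
    """Assumed two-range variant: separate (t0, r_inf) pairs below and above a
    message-size threshold, e.g. to reflect messages that fit in cache or not."""
    t0, r_inf = small if m_bytes <= threshold else large
    return hockney_time(m_bytes, t0, r_inf)
```

With made-up parameters of 10 microseconds latency and 100 MB/s bandwidth, `hockney_time(1000, 10e-6, 100e6)` evaluates to about 20 microseconds; fitting real values would require measurements such as those reported in the paper.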

18.

Adaptive Sector Grouping to Reduce False Sharing in Distributed RAID (total citations: 1; self-citations: 0; citations by others: 1)

Hai Jin Kai Hwang 《Cluster computing》2001,4(2):133-143

Distributed redundant array of inexpensive disks (RAID) is often embedded in a cluster architecture. In a centralized RAID subsystem, the false sharing problem does not exist, because the disk array allows only mutually exclusive access by one user at a time. However, the problem does exist in a distributed RAID architecture, because multiple accesses may occur simultaneously in a distributed environment. This problem will seriously limit the effectiveness of collective I/O operations in network-based, cluster computing. Traditional accesses to disks in a RAID are done at block level. The block granularity is large, say 32 KB, often resulting in false sharing among fragments in the block. The false sharing problem becomes worse when the block size or the stripe unit becomes too large. To solve this problem, we propose an adaptive sector grouping approach to accessing a distributed RAID. Each sector has a fine grain of 512 B. Multiple sectors are grouped together to match with the data block size. The grouped sector has a variable size that can be adaptively adjusted by software. Benchmark experiments reveal the positive effects of this adaptive access scheme on the performance of a RAID. Our scheme can reduce the collective I/O access time without increasing the buffer size. Both theoretical analysis and experimental results demonstrate the performance gain in using grouped sectors for fast access of distributed RAID.
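The effect of access granularity on false sharing can be shown with a short Python sketch. It uses the 512 B sector and 32 KB block sizes mentioned in the abstract, but the function names, the request representation as `(offset, length)` pairs, and the 4 KB group size in the usage note are assumptions for illustration, not the authors' scheme.

```python
SECTOR = 512        # bytes per sector, as stated in the abstract
BLOCK = 32 * 1024   # example block size from the abstract (64 sectors)


def units_touched(offset, length, unit_size):
    """Return the set of access units (unit_size bytes each) a byte range touches."""
    first = offset // unit_size
    last = (offset + length - 1) // unit_size
    return set(range(first, last + 1))


def false_sharing(req_a, req_b, unit_size):
    """True when two requests with disjoint bytes still fall in a common unit."""
    (off_a, len_a), (off_b, len_b) = req_a, req_b
    bytes_overlap = off_a < off_b + len_b and off_b < off_a + len_a
    shared = units_touched(off_a, len_a, unit_size) & units_touched(off_b, len_b, unit_size)
    return bool(shared) and not bytes_overlap
```

For two 4 KB requests at offsets 0 and 8192, `false_sharing` reports a conflict at the 32 KB block granularity but none when the unit is a group of eight sectors (4 KB), which is the kind of gain a smaller, adjustable group size is meant to provide.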

19.

Implementing noncollective parallel I/O in cluster environments using Active Message communication

Jarek Nieplocha Holger Dachsel Ian Foster 《Cluster computing》1999,2(4):271-279

A cost-effective secondary storage architecture for parallel computers is to distribute storage across all processors, which then engage in either computation or I/O, depending on the demands of the moment. A difficulty associated with this architecture is that access to storage on another processor typically requires the cooperation of that processor, which can be hard to arrange if the processor is engaged in other computation. One partial solution to this problem is to require that remote I/O operations occur only via collective calls. In this paper, we describe an alternative approach based on the use of single-sided communication operations such as Active Messages. We present an implementation of this basic approach called Distant I/O and present experimental results that quantify the low-level performance of DIO mechanisms. This technique is exploited to support a noncollective parallel shared-file model for a large out-of-core scientific application with very high I/O bandwidth requirements. The achieved performance exceeds by a wide margin the performance of a well-equipped PIOFS parallel filesystem on the IBM SP.

20.

A parallel programming interface for out-of-core cluster applications

Jianqi Tang Binxing Fang Mingzeng Hu Hongli Zhang 《Cluster computing》2006,9(3):321-327

Clusters of workstations are a practical approach to parallel computing that provide high performance at a low cost for many scientific and engineering applications. In order to handle problems with increasing data sets, methods supporting parallel out-of-core computations must be investigated. Since writing an out-of-core version of a program is a difficult task and virtual memory systems do not perform well in some cases, we have developed a parallel programming interface and the support library to provide efficient and convenient access to the out-of-core data. This paper focuses on how these components extend the range of problem sizes that can be solved on the cluster of workstations. Execution time of Jacobi iteration when using our interface, virtual memory and PVFS are compared to characterize the performance for various problem sizes, and it is concluded that our new interface significantly increases the sizes of problems that can be efficiently solved. Jianqi Tang received B.Sc. and M.Sc. from Harbin Institute of Technology in 1997 and 1999 respectively, both in computer application. Currently, she is a Ph.D. candidate at the Department of Computer Science and engineering, Harbin Institute of Technology. She has participated in several National research projects. Her research interests include parallel computing, parallel I/O and grid computing. Binxing Fang received M.Sc. in 1984 from Tsinghua University and Ph.D. from Harbin Institute of Technology in 1989, both in computer science. From 1990 to 1993 he was with National University of Defense Technology as a postdoctor. Since 1984, he is a faculty member at the Department of Computer Science and engineering of Harbin Institute of Technology, where he is presently a Professor. He is a Member of the National Information Expert Consultant Group and a Standing Member of the Council of Chinese Society of Communications. 
His research efforts focus on parallel computing, computer network and information security. Professor Fang has implemented over 30 projects from the state and ministry/province. Mingzeng Hu was born in 1935. He has been with the Department of Computer Science and engineering in Harbin Institute of Technology since 1958, where he is currently a Professor. He was a visiting scholar in the Siemens Company, Germany from 1978 to 1979, a visiting associate professor in Chiba University, Japan from 1984 to 1985, and a visiting professor in York University, Canada from 1989 to 1995. He is the Director of the National Key Laboratory of Computer Information Content Security. He is also a Member of the 3rd Academic Degree Committee under the State Council of China. Professor Hu’s research interests include high performance computer architecture and parallel processing technology, fault tolerant computing, network system, VL design, and computer system security technology. He has implemented many projects from the state and ministry/province and has won several Ministry Science and Technology Progress Awards. He has published over 100 papers in core journals at home and abroad and one book. Professor Hu has supervised over 20 doctoral students. Hongli Zhang received M.Sc in computer system software in 1996 and Ph.D. in computer architecture in 1999 from Harbin Institute of Technology. Currently, she is an Associate Professor at the Department of Computer Science and engineering, Harbin Institute of Technology. Her research interests include computer network security and parallel computing.