Exploiting redundancy to boost performance in a RAID-10 style cluster-based file system |
| |
Authors: | Yifeng Zhu, Hong Jiang, Xiao Qin, Dan Feng, David R. Swanson |
| |
Institution: | (1) Electrical and Computer Engineering, University of Maine, Orono, ME 04469, USA; (2) Department of Computer Science and Engineering, University of Nebraska, Lincoln, NE 68588, USA; (3) Department of Computer Science, New Mexico Institute of Mining and Technology, Socorro, NM 87801, USA; (4) Department of Computer Science, Huazhong University of Science and Technology, Wuhan, China |
| |
Abstract: | While aggregating the throughput of existing disks on cluster nodes is a cost-effective approach to alleviate the I/O bottleneck
in cluster computing, this approach can suffer performance degradation due to contention for shared resources
on the same node between storage data processing and user task computation. This paper proposes to judiciously utilize the
storage redundancy, in the form of the mirroring that exists in a RAID-10 style file system, to alleviate this performance degradation.
More specifically, a heuristic scheduling algorithm, motivated by observations of a simple cluster configuration, is developed
to spatially schedule write operations on the less loaded node of each mirroring pair. The duplication of modified
data to the mirroring nodes is performed asynchronously in the background. The read performance is improved by two techniques:
doubling the degree of parallelism and hot-spot skipping. A synthetic benchmark is used to evaluate these algorithms in a
real cluster environment and the proposed algorithms are shown to be very effective in performance enhancement.
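
The scheduling ideas summarized in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not give the heuristic's details, so the `Node` class, the scalar `load` metric, and the `hot_threshold` value are all hypothetical assumptions chosen for illustration. The sketch sends each write to the less loaded node of a mirroring pair and records the copy that would later be made in the background; for reads it uses both nodes of a pair unless exactly one of them is a hot spot, in which case that node is skipped.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    load: float  # hypothetical load metric in [0, 1]; not defined in the abstract


def schedule_writes(pairs):
    """Pick the less loaded node of each mirroring pair as the write target.

    Returns (targets, pending), where pending holds (source, destination)
    copies that would be performed asynchronously in the background.
    """
    targets, pending = [], []
    for a, b in pairs:
        primary, mirror = (a, b) if a.load <= b.load else (b, a)
        targets.append(primary)
        pending.append((primary, mirror))
    return targets, pending


def schedule_reads(pairs, hot_threshold=0.8):
    """Choose the nodes to read from in each mirroring pair.

    If exactly one node is a hot spot (load above the assumed threshold),
    skip it and read from its mirror only. Otherwise read from both nodes,
    which doubles the degree of parallelism.
    """
    plan = []
    for a, b in pairs:
        a_hot, b_hot = a.load > hot_threshold, b.load > hot_threshold
        if a_hot and not b_hot:
            plan.append([b])
        elif b_hot and not a_hot:
            plan.append([a])
        else:
            plan.append([a, b])
    return plan


# Two mirroring pairs with made-up loads.
pairs = [
    (Node("n0", 0.9), Node("n0m", 0.2)),
    (Node("n1", 0.1), Node("n1m", 0.3)),
]
targets, pending = schedule_writes(pairs)
write_names = [n.name for n in targets]
read_names = [[n.name for n in group] for group in schedule_reads(pairs)]
print(write_names)
print(read_names)
```

In this toy run the heavily loaded node `n0` is avoided for both the write and the read, while the second pair serves the read from both mirrors. A real system would need a measured load signal and a policy for keeping the mirrors consistent while background copies are pending, neither of which this sketch attempts.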
Yifeng Zhu received his B.Sc. degree in Electrical Engineering in 1998 from Huazhong University of Science and Technology, Wuhan, China;
and the M.S. and Ph.D. degrees in Computer Science from the University of Nebraska-Lincoln in 2002 and 2005, respectively. He is an
assistant professor in the Electrical and Computer Engineering department at the University of Maine. His main research interests
are cluster computing, grid computing, computer architecture and systems, and parallel I/O storage systems. Dr. Zhu is a Member
of ACM, IEEE, the IEEE Computer Society, and the Francis Crowe Society.
Hong Jiang received the B.Sc. degree in Computer Engineering in 1982 from Huazhong University of Science and Technology, Wuhan, China;
the M.A.Sc. degree in Computer Engineering in 1987 from the University of Toronto, Toronto, Canada; and the PhD degree in
Computer Science in 1991 from Texas A&M University, College Station, Texas, USA. Since August 1991 he has been at the
University of Nebraska-Lincoln, Lincoln, Nebraska, USA, where he is Professor and Vice Chair in the Department of Computer
Science and Engineering. His present research interests are computer architecture, parallel/distributed computing, cluster
and Grid computing, computer storage systems and parallel I/O, performance evaluation, real-time systems, middleware, and
distributed systems for distance education. He has over 100 publications in major journals and international conferences in
these areas, and his research has been supported by NSF, DOD, and the State of Nebraska. Dr. Jiang is a member of ACM, the IEEE
Computer Society, and ACM SIGARCH.
Xiao Qin received the BS and MS degrees in computer science from Huazhong University of Science and Technology in 1992 and 1999, respectively.
He received the PhD degree in computer science from the University of Nebraska-Lincoln in 2004. Currently, he is an assistant
professor in the Department of Computer Science at the New Mexico Institute of Mining and Technology. He served as a subject
area editor of IEEE Distributed Systems Online (2000–2001). His research interests are in parallel and distributed systems, storage systems, real-time computing, performance
evaluation, and fault tolerance. He is a member of the IEEE.
Dan Feng received the Ph.D. degree from Huazhong University of Science and Technology, Wuhan, China, in 1997. She is currently a professor
in the School of Computer, Huazhong University of Science and Technology, Wuhan, China. She is the principal scientist of the
National Grand Fundamental Research 973 Program of China “Research on the organization and key technologies of the Storage
System on the next generation Internet.” Her research interests include computer architecture, storage systems, parallel I/O,
massive storage and performance evaluation.
David Swanson received a Ph.D. in physical (computational) chemistry at the University of Nebraska-Lincoln (UNL) in 1995, after which he
worked as an NSF-NATO postdoctoral fellow at the Technical University of Wroclaw, Poland, in 1996, and subsequently as a National
Research Council Research Associate at the Naval Research Laboratory in Washington, DC, from 1997 to 1998. In 1999 he returned
to UNL where he directs the Research Computing Facility and currently serves as an Assistant Research Professor in the Department
of Computer Science and Engineering. The Office of Naval Research, the National Science Foundation, and the State of Nebraska
have supported his research in areas such as large-scale scientific simulation and distributed systems. |
| |
Keywords: | CEFT PVFS, Cluster computing, Data storage, Cluster file systems, Redundancy, RAID |
This article is indexed in SpringerLink and other databases. |
|