TY - JOUR
T1 - Shared virtual memory clusters
T2 - Bridging the cost-performance gap between SMPs and hardware DSM systems
AU - Bilas, Angelos
AU - Jiang, Dongming
AU - Singh, Jaswinder Pal
N1 - Funding Information:
The authors thankfully acknowledge the help of Brian O’Kelley, Xiang Yu, and Sanjeev Kumar with earlier versions of this work. We thank Hongzhang Shan for making available to us the improved version of Barnes. We thank Yuqun Chen for initially porting VMMC to WindowsNT, Scott Karlin for helping debug the hardware during the port, and the staff members of the Computer Science Department for their help with the cumbersome task of managing the system. We also thank the members of the PRISM group at Princeton, in particular Liviu Iftode, Kai Li, Rudrajit Samanta, Limin Wang, and Yuanyuan Zhou for useful discussions. We gratefully acknowledge the support of NSF, DARPA, and NSERC.
PY - 2003/12
Y1 - 2003/12
N2 - Although the shared memory abstraction is gaining ground as a programming abstraction for parallel computing, the main platforms that support it, small-scale symmetric multiprocessors (SMPs) and hardware cache-coherent distributed shared memory systems (DSMs), seem to lie inherently at the extremes of the cost-performance spectrum for parallel systems. In this paper we examine if shared virtual memory (SVM) clusters can bridge this gap by examining how application performance scales on a state-of-the-art shared virtual memory cluster. We find that: (i) The level of application restructuring needed is quite high compared to applications that perform well on a DSM system of the same scale and larger problem sizes are needed for good performance. (ii) However, surprisingly, SVM performs quite well for a fairly wide range of applications, achieving at least half the parallel efficiency of a high-end DSM system at the same scale and often much more.
AB - Although the shared memory abstraction is gaining ground as a programming abstraction for parallel computing, the main platforms that support it, small-scale symmetric multiprocessors (SMPs) and hardware cache-coherent distributed shared memory systems (DSMs), seem to lie inherently at the extremes of the cost-performance spectrum for parallel systems. In this paper we examine if shared virtual memory (SVM) clusters can bridge this gap by examining how application performance scales on a state-of-the-art shared virtual memory cluster. We find that: (i) The level of application restructuring needed is quite high compared to applications that perform well on a DSM system of the same scale and larger problem sizes are needed for good performance. (ii) However, surprisingly, SVM performs quite well for a fairly wide range of applications, achieving at least half the parallel efficiency of a high-end DSM system at the same scale and often much more.
KW - Clusters
KW - Parallel applications
KW - Scalability
KW - Shared virtual memory
KW - System area networks
UR - http://www.scopus.com/inward/record.url?scp=0347534203&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0347534203&partnerID=8YFLogxK
U2 - 10.1016/j.jpdc.2003.08.001
DO - 10.1016/j.jpdc.2003.08.001
M3 - Article
AN - SCOPUS:0347534203
SN - 0743-7315
VL - 63
SP - 1257
EP - 1276
JO - Journal of Parallel and Distributed Computing
JF - Journal of Parallel and Distributed Computing
IS - 12
ER -