Abstract
In this paper we examine how application performance scales on a state-of-the-art shared virtual memory (SVM) system on a cluster with 64 processors, comprising 4-way SMPs connected with a fast system area network. The protocol we use is home-based and takes advantage of general-purpose data movement and mutual exclusion support provided by a programmable network interface. We find that while the level of application restructuring needed is quite high compared to applications that perform well on a hardware-coherent system of this scale, and larger problem sizes are needed for good performance, SVM, surprisingly, performs quite well at the 64-processor scale for a fairly wide range of applications, achieving at least half the parallel efficiency of a high-end hardware-coherent system and often much more. We explore further application restructurings than those developed earlier for smaller-scale SVM systems, examine the main remaining system and application bottlenecks, and point out directions for future research.
Original language | English (US) |
---|---|
Pages | 165-174 |
Number of pages | 10 |
State | Published - 1999 |
Event | Proceedings of the 1999 13th ACM International Conference on Supercomputing, ICS'99 - Rhodes, Greece (Jun 20 1999 → Jun 25 1999) |
All Science Journal Classification (ASJC) codes
- General Computer Science