Monitoring shared virtual memory performance on a Myrinet-based PC cluster

Cheng Liao, Dongming Jiang, Liviu Iftode, Margaret Martonosi, Douglas W. Clark

Research output: Contribution to conferencePaperpeer-review

3 Scopus citations

Abstract

Network-connected clusters of PCs or workstations are becoming a widespread parallel computing platform. Performance methodologies that use either simulation or high-level software instrumentation cannot adequately measure the detailed behavior of such systems. The availability of new network technologies based on programmable network interfaces opens a new avenue of research in analyzing and improving the performance of software shared memory protocols. We have developed monitoring firmware embedded in the programmable network interfaces of a Myrinet-based PC cluster. Timestamps on network packets facilitate the collection of low-level statistics on, e.g., network latencies, interrupt handler times and inter-node synchronization. This paper describes our use of the low-level software performance monitor to measure and understand the performance of a Shared Virtual Memory (SVM) system implemented on a Myrinet-based cluster, running the SPLASH-2 benchmarks. We measured time spent in various communication stages during the main protocol operations: remote page fetch, remote lock synchronization, and barriers. These data show that remote request contention in the network interface and hosts can serialize their handling and artificially increase the page miss time. This increase then dilates the critical section within which it occurs, increasing lock contention and causing lock serialization. Furthermore, lock serialization is reflected in the waiting time at barriers. These results of our study sharpen and deepen similar but higher-level speculations in previous simulation-based SVM performance research. Moreover, the insights about different layers, including communication architecture, SVM protocol, and applications, on real systems provide guidelines for better designs in those layers.

Original languageEnglish (US)
Pages251-258
Number of pages8
DOIs
StatePublished - 1998
EventProceedings of the 1998 International Conference on Supercomputing - Melbourne, Aust
Duration: Jul 13 1998Jul 17 1998

Other

OtherProceedings of the 1998 International Conference on Supercomputing
CityMelbourne, Aust
Period7/13/987/17/98

All Science Journal Classification (ASJC) codes

  • General Computer Science

Fingerprint

Dive into the research topics of 'Monitoring shared virtual memory performance on a Myrinet-based PC cluster'. Together they form a unique fingerprint.

Cite this