TY - JOUR
T1 - Application Restructuring and Performance Portability on Shared Virtual Memory and Hardware-Coherent Multiprocessors
AU - Jiang, Dongming
AU - Shan, Hongzhang
AU - Singh, Jaswinder Pal
PY - 1997/7
Y1 - 1997/7
N2 - The performance portability of parallel programs across a wide range of emerging coherent shared address space systems is not well understood. Programs that run well on efficient, hardware cache-coherent systems often do not perform well on less optimal or more commodity-based communication architectures. This paper studies this issue of performance portability, with the commodity communication architecture of interest being page-grained shared virtual memory (SVM). We begin with applications that perform well on moderate-scale hardware cache-coherent systems, and find that they do not do so well on SVM systems. Then, we examine whether and how the applications can be improved for SVM systems - through data structuring or algorithmic enhancements - and the nature and difficulty of the optimizations. Finally, we examine the impact of the successful optimizations on hardware-coherent platforms themselves, to see whether they are helpful, harmful or neutral on those platforms. We develop a systematic methodology to explore optimizations in different structured classes. The results, and the difficulty of the optimizations, lend insight not only into performance portability but also into the viability of SVM as a platform for these types of applications.
AB - The performance portability of parallel programs across a wide range of emerging coherent shared address space systems is not well understood. Programs that run well on efficient, hardware cache-coherent systems often do not perform well on less optimal or more commodity-based communication architectures. This paper studies this issue of performance portability, with the commodity communication architecture of interest being page-grained shared virtual memory (SVM). We begin with applications that perform well on moderate-scale hardware cache-coherent systems, and find that they do not do so well on SVM systems. Then, we examine whether and how the applications can be improved for SVM systems - through data structuring or algorithmic enhancements - and the nature and difficulty of the optimizations. Finally, we examine the impact of the successful optimizations on hardware-coherent platforms themselves, to see whether they are helpful, harmful or neutral on those platforms. We develop a systematic methodology to explore optimizations in different structured classes. The results, and the difficulty of the optimizations, lend insight not only into performance portability but also into the viability of SVM as a platform for these types of applications.
UR - http://www.scopus.com/inward/record.url?scp=0347306290&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0347306290&partnerID=8YFLogxK
U2 - 10.1145/263767.263792
DO - 10.1145/263767.263792
M3 - Article
AN - SCOPUS:0347306290
SN - 0362-1340
VL - 32
SP - 217
EP - 229
JO - SIGPLAN Notices (ACM Special Interest Group on Programming Languages)
JF - SIGPLAN Notices (ACM Special Interest Group on Programming Languages)
IS - 7
ER -