TY - GEN
T1 - PriME
T2 - 2014 IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS 2014
AU - Fu, Yaosheng
AU - Wentzlaff, David
N1 - Copyright:
Copyright 2014 Elsevier B.V., All rights reserved.
PY - 2014
Y1 - 2014
N2 - Modern processors are integrating an increasing number of cores, which brings new design challenges. However, mainstream architectural simulators primarily focus on unicore or multicore systems with small core counts. In order to simulate emerging manycore architectures, more simulators designed for thousand-core systems are needed. In this paper, we introduce the Princeton Manycore Executor (PriME), a parallelized, MPI-based, x86 manycore simulator. The primary goal of PriME is to provide high performance simulation for manycore architectures, allowing fast exploration of architectural ideas including cache hierarchies, coherence protocols and NoCs in thousand-core systems. PriME supports multi-threaded workloads as well as multi-programmed workloads. Furthermore, it parallelizes the simulation of multi-threaded workloads inside of a host machine and parallelizes the simulation of multi-programmed workloads across multiple host machines by utilizing MPI to communicate between different simulator modules. By using two levels of parallelization (within a host machine and across host machines), PriME can improve simulation performance which is especially useful when simulating thousands of cores. Prime is especially adept at executing simulations which have large memory requirements as previous simulators which use multiple machines are unable to simulate more memory than is available in any single host machine. We demonstrate PriME simulating 2000+ core machines and show near-linear scaling on up to 108 host processors split across 9 machines. Finally we validate PriME against a real-world 40-core machine and show the average error to be 12%.
AB - Modern processors are integrating an increasing number of cores, which brings new design challenges. However, mainstream architectural simulators primarily focus on unicore or multicore systems with small core counts. In order to simulate emerging manycore architectures, more simulators designed for thousand-core systems are needed. In this paper, we introduce the Princeton Manycore Executor (PriME), a parallelized, MPI-based, x86 manycore simulator. The primary goal of PriME is to provide high performance simulation for manycore architectures, allowing fast exploration of architectural ideas including cache hierarchies, coherence protocols and NoCs in thousand-core systems. PriME supports multi-threaded workloads as well as multi-programmed workloads. Furthermore, it parallelizes the simulation of multi-threaded workloads inside of a host machine and parallelizes the simulation of multi-programmed workloads across multiple host machines by utilizing MPI to communicate between different simulator modules. By using two levels of parallelization (within a host machine and across host machines), PriME can improve simulation performance which is especially useful when simulating thousands of cores. Prime is especially adept at executing simulations which have large memory requirements as previous simulators which use multiple machines are unable to simulate more memory than is available in any single host machine. We demonstrate PriME simulating 2000+ core machines and show near-linear scaling on up to 108 host processors split across 9 machines. Finally we validate PriME against a real-world 40-core machine and show the average error to be 12%.
UR - http://www.scopus.com/inward/record.url?scp=84904512114&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84904512114&partnerID=8YFLogxK
U2 - 10.1109/ISPASS.2014.6844467
DO - 10.1109/ISPASS.2014.6844467
M3 - Conference contribution
AN - SCOPUS:84904512114
SN - 9781479936052
T3 - ISPASS 2014 - IEEE International Symposium on Performance Analysis of Systems and Software
SP - 116
EP - 125
BT - ISPASS 2014 - IEEE International Symposium on Performance Analysis of Systems and Software
PB - IEEE Computer Society
Y2 - 23 March 2014 through 25 March 2014
ER -