TY - GEN
T1 - Thread criticality predictors for dynamic performance, power, and resource management in chip multiprocessors
AU - Bhattacharjee, Abhishek
AU - Martonosi, Margaret Rose
PY - 2009
Y1 - 2009
N2 - With the shift towards chip multiprocessors (CMPs), exploiting and managing parallelism has become a central problem in computer systems. Many issues of parallelism management boil down to discerning which running threads or processes are critical, or slowest, versus which are non-critical. If one can accurately predict critical threads in a parallel program, then one can respond in a variety of ways. Possibilities include running the critical thread at a faster clock rate, performing load balancing techniques to offload work onto currently non-critical threads, or giving the critical thread more on-chip resources to execute faster. This paper proposes and evaluates simple but effective thread criticality predictors for parallel applications. We show that accurate predictors can be built using counters that are typically already available on-chip. Our predictor, based on memory hierarchy statistics, identifies thread crit-icality with an average accuracy of 93% across a range of architectures. We also demonstrate two applications of our predictor. First, we show how Intel's Threading Building Blocks (TBB) parallel runtime system can benefit from task stealing techniques that use our criticality predictor to reduce load imbalance. Using criticality prediction to guide TBB's task-stealing decisions improves performance by 13-32% for TBB-based PARSEC benchmarks running on a 32-core CMP. As a second application, criticality prediction guides dynamic energy optimizations in barrier-based applications. By running the predicted critical thread at the full clock rate and frequency-scaling non-critical threads, this approach achieves average energy savings of 15% while negligibly degrading performance for SPLASH-2 and PARSEC benchmarks.
AB - With the shift towards chip multiprocessors (CMPs), exploiting and managing parallelism has become a central problem in computer systems. Many issues of parallelism management boil down to discerning which running threads or processes are critical, or slowest, versus which are non-critical. If one can accurately predict critical threads in a parallel program, then one can respond in a variety of ways. Possibilities include running the critical thread at a faster clock rate, performing load balancing techniques to offload work onto currently non-critical threads, or giving the critical thread more on-chip resources to execute faster. This paper proposes and evaluates simple but effective thread criticality predictors for parallel applications. We show that accurate predictors can be built using counters that are typically already available on-chip. Our predictor, based on memory hierarchy statistics, identifies thread crit-icality with an average accuracy of 93% across a range of architectures. We also demonstrate two applications of our predictor. First, we show how Intel's Threading Building Blocks (TBB) parallel runtime system can benefit from task stealing techniques that use our criticality predictor to reduce load imbalance. Using criticality prediction to guide TBB's task-stealing decisions improves performance by 13-32% for TBB-based PARSEC benchmarks running on a 32-core CMP. As a second application, criticality prediction guides dynamic energy optimizations in barrier-based applications. By running the predicted critical thread at the full clock rate and frequency-scaling non-critical threads, this approach achieves average energy savings of 15% while negligibly degrading performance for SPLASH-2 and PARSEC benchmarks.
KW - Caches
KW - DVFS
KW - Intel TBB
KW - Parallel processing
KW - Thread criticality prediction
UR - http://www.scopus.com/inward/record.url?scp=70450245578&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70450245578&partnerID=8YFLogxK
U2 - 10.1145/1555754.1555792
DO - 10.1145/1555754.1555792
M3 - Conference contribution
AN - SCOPUS:70450245578
SN - 9781605585260
T3 - Proceedings - International Symposium on Computer Architecture
SP - 290
EP - 301
BT - ISCA 2009 - 36th Annual International Symposium on Computer Architecture, Conference Proceedings
T2 - ISCA 2009 - 36th Annual International Symposium on Computer Architecture
Y2 - 20 June 2009 through 24 June 2009
ER -