TY - GEN
T1 - Characterizing the TLB behavior of emerging parallel workloads on chip multiprocessors
AU - Bhattacharjee, Abhishek
AU - Martonosi, Margaret Rose
PY - 2009
Y1 - 2009
N2 - Translation Lookaside Buffers (TLBs) are a staple in modern computer systems and have a significant impact on overall system performance. Numerous prior studies have addressed TLB designs to lower access times and miss rates; these, however, have been targeted towards uniprocessor architectures. As the computer industry embraces chip multiprocessor (CMP) architectures, it is important to study the TLB behavior of emerging parallel workloads. This work presents the first full-system characterization of the TLB behavior of emerging parallel applications on real-system CMPs. Using the PARSEC benchmarks, representative of emerging RMS workloads, we show that TLB misses can hinder system performance significantly. We also evaluate TLB miss stream patterns and show that multiple threads of a parallel execution experience a large number of redundant and predictable misses. For our evaluated benchmarks, 30% to 95% of the total misses fall under this category. Our results point to the need for novel TLB designs encouraging inter-core cooperation, either through hierarchically shared TLBs or through inter-core TLB prediction mechanisms.
UR - http://www.scopus.com/inward/record.url?scp=70449652917&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70449652917&partnerID=8YFLogxK
U2 - 10.1109/PACT.2009.26
DO - 10.1109/PACT.2009.26
M3 - Conference contribution
AN - SCOPUS:70449652917
SN - 9780769537719
T3 - Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
SP - 29
EP - 40
BT - Proceedings - 2009 18th International Conference on Parallel Architectures and Compilation Techniques, PACT 2009
T2 - 2009 18th International Conference on Parallel Architectures and Compilation Techniques, PACT 2009
Y2 - 12 September 2009 through 16 September 2009
ER -