TY - GEN
T1 - Revisiting the sequential programming model for multi-core
AU - Bridges, Matthew J.
AU - Vachharajani, Neil
AU - Zhang, Yun
AU - Jablin, Thomas
AU - August, David I.
PY - 2007
Y1 - 2007
AB - Single-threaded programming is already considered a complicated task. The move to multi-threaded programming only increases the complexity and cost involved in software development due to rewriting legacy code, training of the programmer, increased debugging of the program, and efforts to avoid race conditions, deadlocks, and other problems associated with parallel programming. To address these costs, other approaches, such as automatic thread extraction, have been explored. Unfortunately, the amount of parallelism that has been automatically extracted is generally insufficient to keep many cores busy. This paper argues that this lack of parallelism is not an intrinsic limitation of the sequential programming model, but rather occurs for two reasons. First, there exists no framework for automatic thread extraction that brings together key existing state-of-the-art compiler and hardware techniques. This paper shows that such a framework can yield scalable parallelization on several SPEC CINT2000 benchmarks. Second, existing sequential programming languages force programmers to define a single legal program outcome, rather than allowing for a range of legal outcomes. This paper shows that natural extensions to the sequential programming model enable parallelization for the remainder of the SPEC CINT2000 suite. Our experience demonstrates that, by changing only 60 source code lines, all of the C benchmarks in the SPEC CINT2000 suite were parallelizable by automatic thread extraction. This process, constrained by the limits of modern optimizing compilers, yielded a speedup of 454% on these applications.
UR - http://www.scopus.com/inward/record.url?scp=47349089048&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=47349089048&partnerID=8YFLogxK
U2 - 10.1109/MICRO.2007.20
DO - 10.1109/MICRO.2007.20
M3 - Conference contribution
AN - SCOPUS:47349089048
SN - 0769530478
SN - 9780769530475
T3 - Proceedings of the Annual International Symposium on Microarchitecture, MICRO
SP - 69
EP - 81
BT - Proceedings of the 40th IEEE/ACM International Symposium on Microarchitecture, MICRO 2007
T2 - 40th IEEE/ACM International Symposium on Microarchitecture, MICRO 2007
Y2 - 1 December 2007 through 5 December 2007
ER -