TY - GEN
T1 - Automatic speculative DOALL for clusters
AU - Kim, Hanjun
AU - Johnson, Nick P.
AU - Lee, Jae W.
AU - Mahlke, Scott A.
AU - August, David I.
PY - 2012
Y1 - 2012
N2 - Automatic parallelization for clusters is a promising alternative to time-consuming, error-prone manual parallelization. However, automatic parallelization is frequently limited by the imprecision of static analysis. Moreover, due to the inherent fragility of static analysis, small changes to the source code can significantly undermine performance. By replacing static analysis with speculation and profiling, automatic parallelization becomes more robust and applicable. A naïve automatic speculative parallelization does not scale for distributed memory clusters, due to the high bandwidth required to validate speculation. This work is the first automatic speculative DOALL (Spec-DOALL) parallelization system for clusters. We have implemented a prototype automatic parallelization system, called Cluster Spec-DOALL, which consists of a Spec-DOALL parallelizing compiler and a speculative runtime for clusters. Since the compiler optimizes communication patterns, and the runtime is optimized for the cases in which speculation succeeds, Cluster Spec-DOALL minimizes the communication and validation overheads of the speculative runtime. Across 8 benchmarks, Cluster Spec-DOALL achieves a geomean speedup of 43.8× on a 120-core cluster, whereas DOALL without speculation achieves only 4.5× speedup. This demonstrates that speculation makes scalable fully-automatic parallelization for clusters possible.
AB - Automatic parallelization for clusters is a promising alternative to time-consuming, error-prone manual parallelization. However, automatic parallelization is frequently limited by the imprecision of static analysis. Moreover, due to the inherent fragility of static analysis, small changes to the source code can significantly undermine performance. By replacing static analysis with speculation and profiling, automatic parallelization becomes more robust and applicable. A naïve automatic speculative parallelization does not scale for distributed memory clusters, due to the high bandwidth required to validate speculation. This work is the first automatic speculative DOALL (Spec-DOALL) parallelization system for clusters. We have implemented a prototype automatic parallelization system, called Cluster Spec-DOALL, which consists of a Spec-DOALL parallelizing compiler and a speculative runtime for clusters. Since the compiler optimizes communication patterns, and the runtime is optimized for the cases in which speculation succeeds, Cluster Spec-DOALL minimizes the communication and validation overheads of the speculative runtime. Across 8 benchmarks, Cluster Spec-DOALL achieves a geomean speedup of 43.8× on a 120-core cluster, whereas DOALL without speculation achieves only 4.5× speedup. This demonstrates that speculation makes scalable fully-automatic parallelization for clusters possible.
UR - http://www.scopus.com/inward/record.url?scp=84863500114&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84863500114&partnerID=8YFLogxK
U2 - 10.1145/2259016.2259029
DO - 10.1145/2259016.2259029
M3 - Conference contribution
AN - SCOPUS:84863500114
SN - 9781605586359
T3 - Proceedings - International Symposium on Code Generation and Optimization, CGO 2012
SP - 94
EP - 103
BT - Proceedings - International Symposium on Code Generation and Optimization, CGO 2012
T2 - 10th International Symposium on Code Generation and Optimization, CGO 2012
Y2 - 31 March 2012 through 4 April 2012
ER -