TY - GEN
T1 - Estimating application hierarchical bandwidth requirements using BSP family models
AU - Soviani, Adrian
AU - Singh, Jaswinder Pal
PY - 2012
Y1 - 2012
N2 - There has been a vast amount of work to develop programming models that provide good performance across machine architectures, are easy to use, and have predictable performance. Similarly, the design and optimization of architectures to achieve optimal performance for an application class remains a challenging task. Accurate cost modeling is essential for both application development and system design. Many scientific computing codes are developed by using libraries that provide custom-built collective communication primitives. For example, the family of Bulk Synchronous Parallel (BSP) machine models provides suitable tools for analyzing such problems. However, modeling the effect of bandwidth limitations for globally unbalanced communication and estimating the hierarchical bandwidth used by applications remain key challenges. We present a hierarchical bandwidth machine model (alpha DBSP) that naturally extends the Decomposable BSP (DBSP) model by associating a bandwidth growth factor alpha to each message pattern. Algorithms executed on alpha DBSP have a runtime that is at least as good as DBSP. Hence, there are globally unbalanced problems for which alpha DBSP analysis is simpler or more accurate. We present three scientific computing kernels that illustrate the differences between alpha DBSP and DBSP analysis. Similar to the BSP family models, alpha DBSP predicts collective communication execution time for a given machine. Additionally, alpha DBSP estimates the hierarchical bandwidth required by a given application. System architects may use this estimation to design machines that avoid bandwidth bottlenecks for their target application class.
AB - There has been a vast amount of work to develop programming models that provide good performance across machine architectures, are easy to use, and have predictable performance. Similarly, the design and optimization of architectures to achieve optimal performance for an application class remains a challenging task. Accurate cost modeling is essential for both application development and system design. Many scientific computing codes are developed by using libraries that provide custom-built collective communication primitives. For example, the family of Bulk Synchronous Parallel (BSP) machine models provides suitable tools for analyzing such problems. However, modeling the effect of bandwidth limitations for globally unbalanced communication and estimating the hierarchical bandwidth used by applications remain key challenges. We present a hierarchical bandwidth machine model (alpha DBSP) that naturally extends the Decomposable BSP (DBSP) model by associating a bandwidth growth factor alpha to each message pattern. Algorithms executed on alpha DBSP have a runtime that is at least as good as DBSP. Hence, there are globally unbalanced problems for which alpha DBSP analysis is simpler or more accurate. We present three scientific computing kernels that illustrate the differences between alpha DBSP and DBSP analysis. Similar to the BSP family models, alpha DBSP predicts collective communication execution time for a given machine. Additionally, alpha DBSP estimates the hierarchical bandwidth required by a given application. System architects may use this estimation to design machines that avoid bandwidth bottlenecks for their target application class.
KW - BSP
KW - Collective Communication
KW - DBSP
KW - Interconnect Topology
KW - Performance Modeling
UR - http://www.scopus.com/inward/record.url?scp=84867436450&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84867436450&partnerID=8YFLogxK
U2 - 10.1109/IPDPSW.2012.112
DO - 10.1109/IPDPSW.2012.112
M3 - Conference contribution
AN - SCOPUS:84867436450
SN - 9780769546766
T3 - Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012
SP - 914
EP - 923
BT - Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012
T2 - 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012
Y2 - 21 May 2012 through 25 May 2012
ER -