TY - JOUR
T1 - Efficient data supply for parallel heterogeneous architectures
AU - Ham, Tae Jun
AU - Aragón, Juan L.
AU - Martonosi, Margaret Rose
N1 - Publisher Copyright:
© 2019 Copyright held by the owner/author(s). Publication rights licensed to ACM.
PY - 2019/4
Y1 - 2019/4
AB - Decoupling techniques have been proposed to reduce the amount of memory latency exposed to high-performance accelerators as they fetch data. Although decoupled access-execute (DAE) and more recent decoupled data supply approaches offer promising single-threaded performance improvements, little work has considered how to extend them into parallel scenarios. This article explores the opportunities and challenges of designing parallel, high-performance, resource-efficient decoupled data supply systems. We propose Mercury, a parallel decoupled data supply system that utilizes thread-level parallelism for high-throughput data supply with good portability attributes. Additionally, we introduce some microarchitectural improvements for data supply units to efficiently handle long-latency indirect loads.
KW - Data access optimization
KW - Decoupled architecture
KW - Heterogeneous architecture
UR - http://www.scopus.com/inward/record.url?scp=85065730130&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85065730130&partnerID=8YFLogxK
U2 - 10.1145/3310332
DO - 10.1145/3310332
M3 - Article
AN - SCOPUS:85065730130
SN - 1544-3566
VL - 16
JO - ACM Transactions on Architecture and Code Optimization
JF - ACM Transactions on Architecture and Code Optimization
IS - 2
M1 - 9
ER -