Efficient data supply for parallel heterogeneous architectures

Tae Jun Ham, Juan L. Aragón, Margaret Rose Martonosi

Research output: Contribution to journal › Article › peer-review

4 Scopus citations


Decoupling techniques have been proposed to reduce the amount of memory latency exposed to high-performance accelerators as they fetch data. Although decoupled access-execute (DAE) and more recent decoupled data supply approaches offer promising single-threaded performance improvements, little work has considered how to extend them into parallel scenarios. This article explores the opportunities and challenges of designing parallel, high-performance, resource-efficient decoupled data supply systems. We propose Mercury, a parallel decoupled data supply system that utilizes thread-level parallelism for high-throughput data supply with good portability attributes. Additionally, we introduce some microarchitectural improvements for data supply units to efficiently handle long-latency indirect loads.

Original language: English (US)
Article number: 9
Journal: ACM Transactions on Architecture and Code Optimization
Issue number: 2
State: Published - Apr 2019

All Science Journal Classification (ASJC) codes

  • Software
  • Information Systems
  • Hardware and Architecture


Keywords

  • Data access optimization
  • Decoupled architecture
  • Heterogeneous architecture

