Abstract
In today's computers, heterogeneous processing is used to meet performance targets at manageable power. In adopting increased compute specialization, however, the relative amount of time spent on communication increases. System and software optimizations for communication often come at the cost of increased complexity and reduced portability. The Decoupled Supply-Compute (DeSC) approach offers a way to attack communication latency bottlenecks automatically, while maintaining good portability and low complexity. Our work expands prior Decoupled Access Execute techniques with hardware/software specialization. For a range of workloads, DeSC offers roughly 2× speedup, and additional specialized compression optimizations reduce traffic between decoupled units by 40%.
Original language | English (US) |
---|---|
Article number | 16 |
Journal | ACM Transactions on Architecture and Code Optimization |
Volume | 14 |
Issue number | 2 |
State | Published - Jun 2017 |
All Science Journal Classification (ASJC) codes
- Software
- Information Systems
- Hardware and Architecture
Keywords
- Accelerators
- Communication management
- Decoupled architecture