TY - GEN
T1 - CABLE
T2 - 51st Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2018
AU - Nguyen, Tri
AU - Fuchs, Adi
AU - Wentzlaff, David
N1 - Funding Information:
This material is based on research sponsored by the NSF under Grants No. CNS-1823222 and CCF-1438980, AFOSR under Grant No. FA9550-14-1-0148, Air Force Research Laboratory (AFRL) and Defense Advanced Research Projects Agency (DARPA) under agreement No. FA8650-18-2-7846 and FA8650-18-2-7852. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of Air Force Research Laboratory (AFRL) and Defense Advanced Research Projects Agency (DARPA), the NSF, AFOSR, DARPA, or the U.S. Government.
Publisher Copyright:
© 2018 IEEE.
PY - 2018/12/12
Y1 - 2018/12/12
N2 - Off-chip bandwidth is a scarce resource in modern processors, and it is expected to become even more limited on a per-core basis as we move into the era of high-throughput and massively-parallel computation. One promising approach to overcome limited bandwidth is off-chip link compression. Unfortunately, previously proposed latency-driven compression schemes are not a good fit for latency-tolerant manycore systems, and they often do not have the dictionary capacity to accommodate more than a few concurrent threads. In this work, we present CABLE, a novel CAche-Based Link Encoder that enables point-to-point link compression between coherent caches, re-purposing the data already stored in the caches as a massive and scalable dictionary for data compression. We show the broad applicability of CABLE by applying it to two critical off-chip links: (1) the memory link interface to off-chip memory, and (2) the cache-coherent link between processors in a multi-chip system. We have implemented CABLE's search pipeline hardware in Verilog using the OpenPiton framework to show its feasibility. Evaluating with SPEC2006, we find that CABLE increases effective off-chip bandwidth by 7.2x and system throughput by 3.78x on average, 83% and 258% better than CPACK, respectively.
AB - Off-chip bandwidth is a scarce resource in modern processors, and it is expected to become even more limited on a per-core basis as we move into the era of high-throughput and massively-parallel computation. One promising approach to overcome limited bandwidth is off-chip link compression. Unfortunately, previously proposed latency-driven compression schemes are not a good fit for latency-tolerant manycore systems, and they often do not have the dictionary capacity to accommodate more than a few concurrent threads. In this work, we present CABLE, a novel CAche-Based Link Encoder that enables point-to-point link compression between coherent caches, re-purposing the data already stored in the caches as a massive and scalable dictionary for data compression. We show the broad applicability of CABLE by applying it to two critical off-chip links: (1) the memory link interface to off-chip memory, and (2) the cache-coherent link between processors in a multi-chip system. We have implemented CABLE's search pipeline hardware in Verilog using the OpenPiton framework to show its feasibility. Evaluating with SPEC2006, we find that CABLE increases effective off-chip bandwidth by 7.2x and system throughput by 3.78x on average, 83% and 258% better than CPACK, respectively.
KW - Cache memory
KW - Data compression
KW - Parallel processing
UR - http://www.scopus.com/inward/record.url?scp=85060017744&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85060017744&partnerID=8YFLogxK
U2 - 10.1109/MICRO.2018.00033
DO - 10.1109/MICRO.2018.00033
M3 - Conference contribution
AN - SCOPUS:85060017744
T3 - Proceedings of the Annual International Symposium on Microarchitecture, MICRO
SP - 312
EP - 325
BT - Proceedings - 51st Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2018
PB - IEEE Computer Society
Y2 - 20 October 2018 through 24 October 2018
ER -