TY - JOUR
T1 - Energy-Efficient Monolithic Three-Dimensional On-Chip Memory Architectures
AU - Yu, Ye
AU - Jha, Niraj K.
N1 - Funding Information:
Manuscript received April 13, 2017; revised May 25, 2017; accepted July 19, 2017. Date of publication July 25, 2017; date of current version July 9, 2018. This work was supported by National Science Foundation under Grant CCF-1318603. The review of this paper was coordinated by Associate Editor M. Rahman. (Corresponding author: Y. Yu.) The authors are with the Department of Electrical Engineering, Princeton University, Princeton, NJ 08544 USA (e-mail: yeyu@princeton.edu; jha@princeton.edu). Digital Object Identifier 10.1109/TNANO.2017.2731871
Publisher Copyright:
© 2002-2012 IEEE.
PY - 2018/7
Y1 - 2018/7
N2 - Memory bandwidth is one of the major performance bottlenecks for chip multiprocessors (CMPs), which continue to integrate an increasing number of cores with the help of Moore's Law. The growing disparity between the CPU clock rate and off-chip memory access speed is known as the Memory Wall. This problem has been actively studied in the past two decades. It is addressed by placing memory closer to the processor, such as stacking the memory directly on top of a CMP, thereby significantly reducing the interconnect latency between them. However, previous 3D-stacked memory architectures use through-silicon via (TSV)-based three-dimensional (3-D) integration, which bonds multiple dies with TSVs that have diameters in the 1-5 $\mu \text{m}$ range. Unlike TSV-based 3-D integration, monolithic 3-D integration builds device tiers sequentially on a single substrate. Different tiers are connected using monolithic inter-tier vias (MIVs), which have a diameter (around 50 $\text{nm}$) that is the same as that of a local via. Main memory typically consists of DRAM, which is volatile and thus requires periodic refresh to maintain the stored data. This increases both the energy consumption and access latency. However, various nonvolatile RAMs (NVRAMs) have emerged as possible universal memory technologies, which promise low power, fast read access, high density, and nonvolatility. In this paper, we present an efficient memory interface for monolithic 3D-stacked RAM (both DRAM and NVRAMs such as resistive RAM and nanotube RAM). It takes advantage of the tremendous bandwidth made available by MIVs to implement an on-chip memory bus in order to hide the latency of large data transfers. We propose a multientry row-based write buffer to increase the buffer hit rate and reduce the number of memory core accesses. We decouple read and write accesses using extra interconnects available through MIVs to increase memory throughput. We also present an adaptive power-down policy to maintain balance between energy efficiency and performance. Simulation results show that the proposed architecture can achieve both high performance and energy efficiency, and is thus attractive for low-power/high-performance computing.
AB - Memory bandwidth is one of the major performance bottlenecks for chip multiprocessors (CMPs), which continue to integrate an increasing number of cores with the help of Moore's Law. The growing disparity between the CPU clock rate and off-chip memory access speed is known as the Memory Wall. This problem has been actively studied in the past two decades. It is addressed by placing memory closer to the processor, such as stacking the memory directly on top of a CMP, thereby significantly reducing the interconnect latency between them. However, previous 3D-stacked memory architectures use through-silicon via (TSV)-based three-dimensional (3-D) integration, which bonds multiple dies with TSVs that have diameters in the 1-5 $\mu \text{m}$ range. Unlike TSV-based 3-D integration, monolithic 3-D integration builds device tiers sequentially on a single substrate. Different tiers are connected using monolithic inter-tier vias (MIVs), which have a diameter (around 50 $\text{nm}$) that is the same as that of a local via. Main memory typically consists of DRAM, which is volatile and thus requires periodic refresh to maintain the stored data. This increases both the energy consumption and access latency. However, various nonvolatile RAMs (NVRAMs) have emerged as possible universal memory technologies, which promise low power, fast read access, high density, and nonvolatility. In this paper, we present an efficient memory interface for monolithic 3D-stacked RAM (both DRAM and NVRAMs such as resistive RAM and nanotube RAM). It takes advantage of the tremendous bandwidth made available by MIVs to implement an on-chip memory bus in order to hide the latency of large data transfers. We propose a multientry row-based write buffer to increase the buffer hit rate and reduce the number of memory core accesses. We decouple read and write accesses using extra interconnects available through MIVs to increase memory throughput. We also present an adaptive power-down policy to maintain balance between energy efficiency and performance. Simulation results show that the proposed architecture can achieve both high performance and energy efficiency, and is thus attractive for low-power/high-performance computing.
KW - Chip multiprocessor (CMP)
KW - Memory Wall
KW - NRAM
KW - RRAM
KW - energy efficiency
KW - monolithic three-dimensional (3-D) integration
KW - nonvolatile RAM (NVRAM)
KW - on-chip memory
UR - http://www.scopus.com/inward/record.url?scp=85029180892&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85029180892&partnerID=8YFLogxK
U2 - 10.1109/TNANO.2017.2731871
DO - 10.1109/TNANO.2017.2731871
M3 - Article
AN - SCOPUS:85029180892
SN - 1536-125X
VL - 17
SP - 620
EP - 633
JO - IEEE Transactions on Nanotechnology
JF - IEEE Transactions on Nanotechnology
IS - 4
ER -