TY - JOUR
T1 - Scaling Up In-Memory-Computing Classifiers via Boosted Feature Subsets in Banked Architectures
AU - Tang, Yinqi
AU - Zhang, Jintao
AU - Verma, Naveen
N1 - Funding Information:
Manuscript received March 13, 2018; revised June 3, 2018; accepted July 4, 2018. Date of publication July 10, 2018; date of current version February 26, 2019. This work was supported by NSF under Grant CCF-1253670. This brief was recommended by Associate Editor Y. Liu. (Corresponding author: Yinqi Tang.) The authors are with the Department of Electrical Engineering, Princeton University, Princeton, NJ 08544 USA (e-mail: yinqit@princeton.edu; jintao@princeton.edu; nverma@princeton.edu).
Publisher Copyright:
© 2019 IEEE.
PY - 2019/3
Y1 - 2019/3
N2 - In-memory computing is an emerging approach for overcoming memory-access bottlenecks by eliminating the cost of explicitly moving data from the point of storage to a point of computation outside the array. However, computation increases the dynamic range of signals, such that performing it via the existing structure of dense memory substantially reduces the signal-to-noise ratio (SNR). In this brief, we explore how computations can be scaled up to jointly optimize energy/latency/bandwidth gains against SNR requirements. We employ algorithmic techniques to decompose computations so that they can be mapped to multiple parallel memory banks operating at chosen optimal points. Focusing specifically on in-memory classification, we consider a custom IC in 130-nm CMOS and demonstrate an algorithm combining error-adaptive classifier boosting and multi-armed bandits to enable segmentation of a feature vector into multiple subsets. The measured accuracy of 10-way MNIST digit classification, using images downsampled to 16×16 pixels (mapped across four separate banks), is 91%, close to that simulated using full unsegmented feature vectors. The energy per classification is 879.7 pJ, 14.3× lower than that of a system based on separate memory and a digital accelerator.
AB - In-memory computing is an emerging approach for overcoming memory-access bottlenecks by eliminating the cost of explicitly moving data from the point of storage to a point of computation outside the array. However, computation increases the dynamic range of signals, such that performing it via the existing structure of dense memory substantially reduces the signal-to-noise ratio (SNR). In this brief, we explore how computations can be scaled up to jointly optimize energy/latency/bandwidth gains against SNR requirements. We employ algorithmic techniques to decompose computations so that they can be mapped to multiple parallel memory banks operating at chosen optimal points. Focusing specifically on in-memory classification, we consider a custom IC in 130-nm CMOS and demonstrate an algorithm combining error-adaptive classifier boosting and multi-armed bandits to enable segmentation of a feature vector into multiple subsets. The measured accuracy of 10-way MNIST digit classification, using images downsampled to 16×16 pixels (mapped across four separate banks), is 91%, close to that simulated using full unsegmented feature vectors. The energy per classification is 879.7 pJ, 14.3× lower than that of a system based on separate memory and a digital accelerator.
KW - Boosting
KW - feature segmentation
KW - in-memory computing
KW - machine learning
KW - multi-armed bandits
UR - http://www.scopus.com/inward/record.url?scp=85049798280&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85049798280&partnerID=8YFLogxK
U2 - 10.1109/TCSII.2018.2854759
DO - 10.1109/TCSII.2018.2854759
M3 - Article
AN - SCOPUS:85049798280
SN - 1549-7747
VL - 66
SP - 477
EP - 481
JO - IEEE Transactions on Circuits and Systems II: Express Briefs
JF - IEEE Transactions on Circuits and Systems II: Express Briefs
IS - 3
M1 - 8409339
ER -