Abstract
In-memory computing is an emerging approach for overcoming memory-access bottlenecks: it eliminates the cost of explicitly moving data from the point of storage to a point of computation outside the array. However, computation increases the dynamic range of signals, so performing it within the existing structure of dense memory substantially squeezes the signal-to-noise ratio (SNR). In this brief, we explore how computations can be scaled up to jointly optimize energy/latency/bandwidth gains with SNR requirements. We employ algorithmic techniques to decompose computations so that they can be mapped to multiple parallel memory banks, each operating at a chosen optimal point. Focusing specifically on in-memory classification, we consider a custom IC in 130-nm CMOS and demonstrate an algorithm that combines error-adaptive classifier boosting and multi-armed bandits to enable segmentation of a feature vector into multiple subsets. The measured accuracy of 10-way MNIST digit classification, using images downsampled to 16×16 pixels (mapped across four separate banks), is 91%, close to that simulated using full unsegmented feature vectors. The energy per classification is 879.7 pJ, 14.3× lower than that of a system based on separate memory and a digital accelerator.
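As a rough illustration of the segmentation idea and the energy figures above, the following Python sketch splits a 256-element feature vector (16×16 pixels) across four banks and checks that a linear score decomposes into per-bank partial scores. It does not implement the boosting or multi-armed-bandit steps of the algorithm. The equal contiguous split, the 4-bit pixel values, the ±1 weights, and all function and variable names are assumptions made for illustration only; the abstract does not specify them.

```python
import random

NUM_PIXELS = 16 * 16   # feature count for the 16x16 downsampled images
NUM_BANKS = 4          # number of memory banks stated in the abstract


def segment(features, num_banks):
    """Split a vector into equal contiguous subsets (assumed scheme)."""
    size = len(features) // num_banks
    return [features[i * size:(i + 1) * size] for i in range(num_banks)]


def dot(a, b):
    """Plain inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


rng = random.Random(0)
x = [rng.randint(0, 15) for _ in range(NUM_PIXELS)]   # hypothetical 4-bit pixels
w = [rng.choice((-1, 1)) for _ in range(NUM_PIXELS)]  # hypothetical +/-1 weights

# Each bank sees only its own feature subset and produces a partial score.
partial_scores = [
    dot(xs, ws)
    for xs, ws in zip(segment(x, NUM_BANKS), segment(w, NUM_BANKS))
]
full_score = dot(x, w)

# Baseline energy implied by the two figures quoted in the abstract.
ENERGY_PJ = 879.7      # measured energy per classification
RATIO = 14.3           # stated reduction versus the conventional system
baseline_pj = ENERGY_PJ * RATIO

print(sum(partial_scores) == full_score)
print(f"implied baseline: {baseline_pj / 1000:.2f} nJ per classification")
```

Because each bank handles only 64 of the 256 features in this sketch, the range of any single partial score is smaller than that of the full score, which is the motivation for segmenting in the first place. The implied baseline of roughly 12.58 nJ is derived purely from the two numbers quoted above, not a separately reported measurement.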
Original language | English (US) |
---|---|
Article number | 8409339 |
Pages (from-to) | 477-481 |
Number of pages | 5 |
Journal | IEEE Transactions on Circuits and Systems II: Express Briefs |
Volume | 66 |
Issue number | 3 |
DOIs | |
State | Published - Mar 2019 |
All Science Journal Classification (ASJC) codes
- Electrical and Electronic Engineering
Keywords
- Boosting
- feature segmentation
- in-memory computing
- machine learning
- multi-armed bandits