An In-memory-Computing DNN Achieving 700 TOPS/W and 6 TOPS/mm² in 130-nm CMOS

Jintao Zhang, Naveen Verma

Research output: Contribution to journal › Article

2 Scopus citations

Abstract

Deep neural networks (DNNs) are increasingly popular in machine learning and have achieved state-of-the-art performance in a range of tasks. Typically, the best results are achieved using a large amount of training data and large models, which make both training and inference complex. While GPUs are used in many applications for the parallel computing they provide, lower-energy platforms have the potential to enable a range of new applications. A trend being observed is the ability to reduce the precision of weights and activations, with previous research showing that in some cases, weights and activations can be binarized [i.e., binarized neural networks (BNNs)], significantly reducing the model size. Exploiting this toward reduced compute energy and reduced data-movement energy, we demonstrate a BNN mapped to a previously presented in-memory-computing architecture, where binarized weights are stored in a standard 6T SRAM bit cell and computations are performed via an analog operation. Using a reduced-size BNN, chosen to fit on the CMOS prototype (in 130 nm), MNIST classification is achieved with only 0.4% accuracy degradation (from 94%), but at 26× lower energy compared to a digital approach implementing the same network. The system reaches over 700-TOPS/W energy efficiency and 6-TOPS/mm² throughput.
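As a rough illustration of the binarization the abstract refers to, the sketch below computes one binarized neuron in software. It is not the authors' implementation: the chip described above performs the accumulation as an analog operation inside the SRAM array, whereas this is a plain digital reference. The function names, the example values, and the choice to map zero to +1 are all assumptions made for illustration.

```python
def binarize(values):
    """Map real values to +1/-1 by sign (assumption: zero maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def dot_product(weights, activations):
    """Multiply-accumulate over two +1/-1 vectors of equal length."""
    return sum(w * a for w, a in zip(weights, activations))


def xnor_popcount_dot(weights, activations):
    """Same dot product, computed as 2 * (number of matching signs) - N."""
    matches = sum(1 for w, a in zip(weights, activations) if w == a)
    return 2 * matches - len(weights)


def bnn_neuron(weights, activations):
    """Binarized neuron output: sign of the accumulated dot product."""
    return 1 if dot_product(weights, activations) >= 0 else -1


# Hypothetical example values, chosen only to exercise the functions.
w = binarize([0.7, -0.2, 0.1, -0.9])
a = binarize([0.3, 0.5, -0.4, -0.1])
```

With weights and activations restricted to +1/-1, each multiplication reduces to a sign comparison, which is why each weight needs only a single stored bit.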

Original language: English (US)
Article number: 8695076
Pages (from-to): 358-366
Number of pages: 9
Journal: IEEE Journal on Emerging and Selected Topics in Circuits and Systems
Volume: 9
Issue number: 2
DOIs
State: Published - Jun 2019

All Science Journal Classification (ASJC) codes

  • Electrical and Electronic Engineering

Keywords

  • Machine learning
  • binary neural network
  • deep neural network
  • in-memory computing
  • system-on-chip
