TY - GEN
T1 - Statistical computing framework and demonstration for in-memory computing systems
AU - Zhang, Bonan
AU - Deaville, Peter
AU - Verma, Naveen
N1 - Publisher Copyright:
© 2022 Owner/Author.
PY - 2022/7/10
Y1 - 2022/7/10
N2 - With the increasing importance of data-intensive workloads, such as AI, in-memory computing (IMC) has demonstrated substantial energy/throughput benefits by addressing both compute and data-movement/accessing costs, and holds significant further promise by its ability to leverage emerging forms of highly-scaled memory technologies. However, IMC fundamentally derives its advantages through parallelism, which poses a trade-off with SNR, whereby variations and noise in nanoscaled devices directly limit possible gains. In this work, we propose novel training approaches to improve model tolerance to noise via a contrastive loss function and a progressive training procedure. We further propose a methodology for modeling and calibrating hardware noise, efficiently at the level of a macro operation and through a limited number of hardware measurements. The approaches are demonstrated on a fabricated MRAM-based IMC prototype in 22nm FD-SOI, together with a neural network training framework implemented in PyTorch. For CIFAR-10/100 classifications, model performance is restored to the level of ideal noise-free execution, and generalized performance of the trained model deployed across different chips is demonstrated.
KW - MRAM
KW - deep learning
KW - in-memory computing
KW - statistical computing
UR - http://www.scopus.com/inward/record.url?scp=85137492034&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85137492034&partnerID=8YFLogxK
U2 - 10.1145/3489517.3530557
DO - 10.1145/3489517.3530557
M3 - Conference contribution
AN - SCOPUS:85137492034
T3 - Proceedings - Design Automation Conference
SP - 979
EP - 984
BT - Proceedings of the 59th ACM/IEEE Design Automation Conference, DAC 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 59th ACM/IEEE Design Automation Conference, DAC 2022
Y2 - 10 July 2022 through 14 July 2022
ER -