TY - JOUR
T1 - Neural Network Training with Stochastic Hardware Models and Software Abstractions
AU - Zhang, Bonan
AU - Chen, Lung Yen
AU - Verma, Naveen
N1 - Funding Information:
This work was supported by the Air Force Research Laboratory (AFRL) and Defense Advanced Research Projects Agency (DARPA) under Agreement FA8650-18-2-7866.
Publisher Copyright:
© 2004-2012 IEEE.
PY - 2021/4
Y1 - 2021/4
N2 - Machine learning inference is of broad interest, increasingly in energy-constrained applications. However, platforms are often pushed to their energy limits, especially with deep learning models, which provide state-of-the-art inference performance but are also computationally intensive. This has motivated algorithmic co-design, where flexibility in the model and model parameters, derived from training, is exploited for hardware energy efficiency. This work extends a model-training algorithm referred to as Stochastic Data-Driven Hardware Resilience (S-DDHR) to enable statistical models of computations, amenable to energy/throughput-aggressive hardware operating points as well as emerging variation-prone device technologies. S-DDHR itself extends the previous approach of DDHR by incorporating the statistical distribution of hardware variations for model-parameter learning, rather than a sample of the distributions. This is critical to developing accurate and composable abstractions of computations, to enable scalable hardware-generalized training, rather than hardware instance-by-instance training. S-DDHR is demonstrated and evaluated for a bit-scalable MRAM-based in-memory computing architecture, whose energy/throughput trade-offs explicitly motivate statistical computations. Using foundry data to model MRAM device variations, S-DDHR is shown to preserve high inference performance for benchmark datasets (MNIST, CIFAR-10, SVHN) as variation parameters are scaled to high levels, exhibiting less than 3.5% accuracy drop at 10× the nominal variation level.
AB - Machine learning inference is of broad interest, increasingly in energy-constrained applications. However, platforms are often pushed to their energy limits, especially with deep learning models, which provide state-of-the-art inference performance but are also computationally intensive. This has motivated algorithmic co-design, where flexibility in the model and model parameters, derived from training, is exploited for hardware energy efficiency. This work extends a model-training algorithm referred to as Stochastic Data-Driven Hardware Resilience (S-DDHR) to enable statistical models of computations, amenable to energy/throughput-aggressive hardware operating points as well as emerging variation-prone device technologies. S-DDHR itself extends the previous approach of DDHR by incorporating the statistical distribution of hardware variations for model-parameter learning, rather than a sample of the distributions. This is critical to developing accurate and composable abstractions of computations, to enable scalable hardware-generalized training, rather than hardware instance-by-instance training. S-DDHR is demonstrated and evaluated for a bit-scalable MRAM-based in-memory computing architecture, whose energy/throughput trade-offs explicitly motivate statistical computations. Using foundry data to model MRAM device variations, S-DDHR is shown to preserve high inference performance for benchmark datasets (MNIST, CIFAR-10, SVHN) as variation parameters are scaled to high levels, exhibiting less than 3.5% accuracy drop at 10× the nominal variation level.
KW - Statistical computing
KW - circuit reliability
KW - deep learning
KW - fault tolerance
KW - in-memory computing
UR - http://www.scopus.com/inward/record.url?scp=85100467536&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85100467536&partnerID=8YFLogxK
U2 - 10.1109/TCSI.2021.3052981
DO - 10.1109/TCSI.2021.3052981
M3 - Article
AN - SCOPUS:85100467536
SN - 1549-8328
VL - 68
SP - 1532
EP - 1542
JO - IEEE Transactions on Circuits and Systems I: Regular Papers
JF - IEEE Transactions on Circuits and Systems I: Regular Papers
IS - 4
M1 - 9336298
ER -