TransCODE: Co-Design of Transformers and Accelerators for Efficient Training and Inference

Shikhar Tuli, Niraj K. Jha

Research output: Contribution to journal › Article › peer-review


Abstract

Automated co-design of machine learning models and evaluation hardware is critical for efficiently deploying such models at scale. Despite their state-of-the-art performance, transformer models are not yet ready for execution on resource-constrained hardware platforms. The high memory requirements and low parallelizability of the transformer architecture exacerbate this problem. Recently proposed accelerators attempt to optimize the throughput and energy consumption of transformer models. However, such works are limited either to a one-sided search of the model architecture or to a restricted set of off-the-shelf devices. Furthermore, previous works accelerate only model inference and not training, which requires substantially more memory and compute resources, making the problem even more challenging. To address these limitations, this work proposes a dynamic training framework, called DynaProp, that speeds up the training process and reduces memory consumption. DynaProp is a low-overhead pruning method that prunes activations and gradients at runtime. To effectively execute this method on hardware for a diverse set of transformer architectures, we propose ELECTOR, a framework that simulates transformer inference and training on a design space of accelerators. We use this simulator in conjunction with the proposed co-design technique, called TransCODE, to obtain the best-performing models, which achieve high accuracy on the given task while minimizing latency, energy consumption, and chip area. The obtained transformer-accelerator pair achieves 0.3% higher accuracy than the state-of-the-art pair while incurring 5.2× lower latency and 3.0× lower energy consumption.
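
The abstract describes DynaProp only as a low-overhead method that prunes activations and gradients at runtime. The following is a minimal sketch of that general idea for a single linear layer, not the authors' implementation. It assumes magnitude-based thresholding, and every name in it (`prune_by_magnitude`, `linear_forward`, `linear_backward`, `tau_act`, `tau_grad`) is hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def prune_by_magnitude(x: Matrix, tau: float) -> Tuple[Matrix, float]:
    """Zero every entry with |value| < tau; return the pruned matrix and its sparsity.

    ASSUMPTION: a fixed magnitude threshold is used here for illustration only.
    """
    pruned = [[v if abs(v) >= tau else 0.0 for v in row] for row in x]
    total = sum(len(row) for row in x)
    zeros = sum(1 for row in pruned for v in row if v == 0.0)
    return pruned, (zeros / total if total else 0.0)


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Dense product for clarity; sparsity-aware hardware could skip zeroed operands."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def linear_forward(x: Matrix, w: Matrix, tau_act: float) -> Tuple[Matrix, Matrix, float]:
    """Prune the input activations at runtime, then compute y = x_pruned @ w."""
    x_pruned, act_sparsity = prune_by_magnitude(x, tau_act)
    return matmul(x_pruned, w), x_pruned, act_sparsity


def linear_backward(
    grad_out: Matrix, x_pruned: Matrix, w: Matrix, tau_grad: float
) -> Tuple[Matrix, Matrix, float]:
    """Prune the incoming gradient at runtime, then backpropagate through y = x @ w."""
    g_pruned, grad_sparsity = prune_by_magnitude(grad_out, tau_grad)
    grad_w = matmul(transpose(x_pruned), g_pruned)  # dL/dw = x^T @ g
    grad_x = matmul(g_pruned, transpose(w))         # dL/dx = g @ w^T
    return grad_x, grad_w, grad_sparsity


if __name__ == "__main__":
    x = [[0.5, 0.01], [-0.02, -1.0]]
    w = [[1.0, 2.0], [3.0, 4.0]]
    y, x_p, s_act = linear_forward(x, w, tau_act=0.1)
    gx, gw, s_grad = linear_backward([[0.05, 1.0], [2.0, -0.001]], x_p, w, tau_grad=0.1)
    print("activation sparsity:", s_act, "gradient sparsity:", s_grad)
```

In this sketch, only the pruned activations are kept for the backward pass, which is one plausible way that runtime pruning could reduce training memory; the thresholds trade sparsity against fidelity of the computed gradients.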

Original language: English (US)
Pages (from-to): 4817-4830
Number of pages: 14
Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Volume: 42
Issue number: 12
State: Published - Dec 1 2023
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Software
  • Electrical and Electronic Engineering
  • Computer Graphics and Computer-Aided Design

Keywords

  • Application-specific integrated circuits (ASICs)
  • hardware software co-design
  • machine learning
  • neural network accelerators
  • transformers
