TransCODE: Co-design of Transformers and Accelerators for Efficient Training and Inference

Shikhar Tuli, Niraj K. Jha

Research output: Contribution to journal › Article › peer-review

Abstract

Automated co-design of machine learning models and evaluation hardware is critical for efficiently deploying such models at scale. Despite the state-of-the-art performance of transformer models, they are not yet ready for execution on resource-constrained hardware platforms. The high memory requirements and low parallelizability of the transformer architecture exacerbate this problem. Recently proposed accelerators attempt to optimize the throughput and energy consumption of transformer models. However, such works are limited either to a one-sided search of the model architecture or to a restricted set of off-the-shelf devices. Furthermore, previous works only accelerate model inference and not training, which requires substantially more memory and compute resources, making the problem even more challenging. To address these limitations, this work proposes a dynamic training framework, called DynaProp, that speeds up the training process and reduces memory consumption. DynaProp is a low-overhead pruning method that prunes activations and gradients at runtime. To effectively execute this method on hardware for a diverse set of transformer architectures, we propose ELECTOR, a framework that simulates transformer inference and training on a design space of accelerators. We use this simulator in conjunction with the proposed co-design technique, called TransCODE, to obtain the best-performing models: those with high accuracy on the given task and minimal latency, energy consumption, and chip area. The obtained transformer-accelerator pair achieves 0.3% higher accuracy than the state-of-the-art pair while incurring 5.2× lower latency and 3.0× lower energy consumption.
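
The abstract does not give implementation details, but the core idea of pruning activations and gradients at runtime can be sketched briefly. The snippet below is a minimal, hypothetical PyTorch illustration, not the authors' DynaProp implementation: the layer name PrunedLinear, the helper magnitude_prune, and the fixed threshold value are all assumptions made for illustration.

    import torch
    import torch.nn as nn

    def magnitude_prune(t: torch.Tensor, threshold: float) -> torch.Tensor:
        # Zero entries whose magnitude falls below the threshold
        # (an illustrative stand-in for DynaProp's runtime pruning).
        return t * (t.abs() >= threshold)

    class PrunedLinear(nn.Module):
        # Hypothetical layer: prunes small activations on the forward
        # pass and small gradients on the backward pass.
        def __init__(self, in_features: int, out_features: int,
                     threshold: float = 1e-3):
            super().__init__()
            self.linear = nn.Linear(in_features, out_features)
            self.threshold = threshold

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            y = magnitude_prune(self.linear(x), self.threshold)  # prune activations
            if y.requires_grad:
                # Prune gradients flowing back through this layer at runtime.
                y.register_hook(lambda g: magnitude_prune(g, self.threshold))
            return y

Sparsifying both activations and gradients shrinks what must be kept live during backpropagation, which is how a scheme of this kind can reduce training-time memory and compute on hardware that exploits sparsity.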

Original language: English (US)
Pages (from-to): 1
Number of pages: 1
Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
DOIs
State: Accepted/In press - 2023

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Graphics and Computer-Aided Design
  • Electrical and Electronic Engineering

Keywords

  • Application-specific integrated circuits
  • Computational modeling
  • Hardware
  • hardware-software co-design
  • Integrated circuit modeling
  • machine learning
  • neural network accelerators
  • Task analysis
  • Training
  • transformers
  • Uncertainty
