Exact Scheduling to Minimize Off-Chip Data Movement for Deep Learning Accelerators

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

Specialized hardware accelerators are increasingly used to improve the performance and power efficiency of Deep Neural Network (DNN) applications. However, their benefits are limited by expensive off-chip data movement between host memory and the accelerator's on-chip scratchpad, which can consume significantly more energy than accelerator computation [13]. While application-level DNN operators can have arbitrary sizes, accelerators typically support only fixed-size operations because of constrained on-chip memory and micro-architectures. Consequently, mapping an application-level operator to an accelerator involves decomposing it into loops over smaller tiles. Different choices of tile sizes, loop orders, and memory partitioning across tensors yield a vast design space with large differences in off-chip data movement volume. To address this challenge, we introduce Shoehorn, a schedule optimization framework that jointly optimizes loop tiling, loop ordering, and memory partitioning for mapping application-level DNN operators to hardware accelerators. Shoehorn generates optimal schedules in under a second and outperforms state-of-the-art approaches, reducing total off-chip memory traffic by up to 51% relative to competing schedulers for several widely used DNN applications on three distinct hardware accelerator targets.
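To make the design space concrete, the following is a minimal sketch of a brute-force search over tile sizes and loop orders for a single matrix multiplication, C[m, n] += A[m, k] * B[k, n]. It is not Shoehorn's algorithm, which the abstract does not describe. The traffic model is a deliberate simplification, and every name in it (`best_schedule`, `tile_loads`, `traffic`, the element-count `capacity`) is a hypothetical chosen for illustration. The sketch assumes tile sizes divide the operator dimensions, that one tile of each tensor is resident at a time, and that memory partitioning is captured only by requiring the three tiles to fit in the scratchpad together.

```python
from itertools import permutations, product

# Loop indices each tensor of C[m, n] += A[m, k] * B[k, n] depends on.
RELEVANT = {"A": {"m", "k"}, "B": {"k", "n"}, "C": {"m", "n"}}


def divisors(x):
    """All positive divisors of x, so tiles evenly cover each dimension."""
    return [d for d in range(1, x + 1) if x % d == 0]


def tile_loads(order, trips, relevant):
    """Count how often a tensor's tile is brought on chip.

    `order` lists loop indices from outermost to innermost. A loop the
    tensor does not depend on adds no transfers as long as no relevant
    loop nested inside it changes the resident tile.
    """
    loads, tile_changed = 1, False
    for idx in reversed(order):  # walk from innermost to outermost
        if idx in relevant:
            loads *= trips[idx]
            tile_changed = tile_changed or trips[idx] > 1
        elif tile_changed:
            loads *= trips[idx]
    return loads


def traffic(dims, tiles, order):
    """Off-chip traffic in elements under the simplified model."""
    trips = {i: dims[i] // tiles[i] for i in dims}
    total = 0
    for name, rel in RELEVANT.items():
        size = 1
        for i in rel:
            size *= tiles[i]
        loads = tile_loads(order, trips, rel)
        if name == "C":
            # Every visit writes the tile back; every visit after the
            # first also reloads the partial sums.
            first_visits = trips["m"] * trips["n"]
            total += size * (2 * loads - first_visits)
        else:
            total += size * loads
    return total


def best_schedule(M, N, K, capacity):
    """Exhaustively pick the tiles and loop order with minimum traffic."""
    dims = {"m": M, "n": N, "k": K}
    best = None
    for tm, tn, tk in product(divisors(M), divisors(N), divisors(K)):
        # All three tiles must share the scratchpad.
        if tm * tk + tk * tn + tm * tn > capacity:
            continue
        tiles = {"m": tm, "n": tn, "k": tk}
        for order in permutations("mnk"):
            cost = traffic(dims, tiles, order)
            if best is None or cost < best[0]:
                best = (cost, (tm, tn, tk), order)
    return best
```

Under this model, `best_schedule(4, 4, 4, 48)` finds a schedule that moves each tensor exactly once, while shrinking `capacity` forces smaller tiles and repeated transfers. Exhaustive enumeration is only practical for small operators, which is one reason a dedicated optimization framework is attractive for realistic layer sizes.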

Original language: English (US)
Title of host publication: ASP-DAC 2024 - 29th Asia and South Pacific Design Automation Conference, Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 908-914
Number of pages: 7
ISBN (Electronic): 9798350393545
DOIs
State: Published - 2024
Event: 29th Asia and South Pacific Design Automation Conference, ASP-DAC 2024 - Incheon, Korea, Republic of
Duration: Jan 22 2024 to Jan 25 2024

Publication series

Name: Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC

Conference

Conference: 29th Asia and South Pacific Design Automation Conference, ASP-DAC 2024
Country/Territory: Korea, Republic of
City: Incheon
Period: 1/22/24 to 1/25/24

All Science Journal Classification (ASJC) codes

  • Electrical and Electronic Engineering
  • Computer Science Applications
  • Computer Graphics and Computer-Aided Design
