Towards memory-efficient inference in edge video analytics

Arthi Padmanabhan, Anand Padmanabha Iyer, Ganesh Ananthanarayanan, Yuanchao Shu, Nikolaos Karianakis, Guoqing Harry Xu, Ravi Netravali

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Video analytics pipelines incorporate on-premise edge servers to lower analysis latency, ensure privacy, and reduce bandwidth requirements. However, compared to the cloud, edge servers typically have lower processing power and GPU memory, limiting the number of video streams that they can manage and analyze. Existing solutions for memory management, such as swapping models in and out of GPU memory, having a common model stem, or compression and quantization to reduce the model size, incur high overheads and often provide limited benefits. In this paper, we propose model merging as an approach towards memory management at the edge. This proposal is based on our observation that models at the edge share common layers, and that merging these common layers across models can result in significant memory savings. Our preliminary evaluation indicates that such an approach could result in up to 75% savings in memory requirements. We conclude by discussing several challenges involved with realizing the model merging vision.
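As a rough illustration of why sharing common layers saves memory, the following minimal sketch compares the footprint of several models stored separately against the footprint when layers with the same signature are stored once. It is not the authors' implementation: the layer names, parameter counts, the `Layer` type, and the assumption that layers with an identical signature are interchangeable (same weights) are all hypothetical and chosen only for the arithmetic.

```python
from typing import Dict, List, Tuple

# A layer is described by (signature, number of float32 parameters).
# ASSUMPTION: two layers with the same signature can share one copy of weights.
Layer = Tuple[str, int]

BYTES_PER_PARAM = 4  # float32


def unmerged_bytes(models: Dict[str, List[Layer]]) -> int:
    """Memory if every model keeps a private copy of each of its layers."""
    return sum(n * BYTES_PER_PARAM
               for layers in models.values()
               for _, n in layers)


def merged_bytes(models: Dict[str, List[Layer]]) -> int:
    """Memory if each distinct layer signature is stored exactly once."""
    unique: Dict[str, int] = {}
    for layers in models.values():
        for signature, n in layers:
            unique[signature] = n
    return sum(n * BYTES_PER_PARAM for n in unique.values())


def savings_fraction(models: Dict[str, List[Layer]]) -> float:
    """Fraction of memory saved by merging shared layers."""
    before = unmerged_bytes(models)
    return (before - merged_bytes(models)) / before


# Hypothetical example: four models with a common backbone and private heads.
backbone: List[Layer] = [("conv1", 1000), ("conv2", 3000)]
example_models: Dict[str, List[Layer]] = {
    name: backbone + [(f"head_{name}", 500)] for name in ("a", "b", "c", "d")
}

if __name__ == "__main__":
    print(unmerged_bytes(example_models))
    print(merged_bytes(example_models))
    print(round(savings_fraction(example_models), 3))
```

In this toy setup the savings grow with the share of parameters held in common layers and with the number of models that reuse them; the numbers here are illustrative and are not results from the paper.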

Original language: English (US)
Title of host publication: HotEdgeVideo 2021 - Proceedings of the 2021 3rd ACM Workshop on Hot Topics in Video Analytics and Intelligent Edges
Publisher: Association for Computing Machinery, Inc
Pages: 31-37
Number of pages: 7
ISBN (Electronic): 9781450387002
DOIs
State: Published - Oct 25 2021
Event: 3rd ACM Workshop on Hot Topics in Video Analytics and Intelligent Edges, HotEdgeVideo 2021 - New Orleans, United States
Duration: Oct 25 2021 → …

Publication series

Name: HotEdgeVideo 2021 - Proceedings of the 2021 3rd ACM Workshop on Hot Topics in Video Analytics and Intelligent Edges

Conference

Conference: 3rd ACM Workshop on Hot Topics in Video Analytics and Intelligent Edges, HotEdgeVideo 2021
Country/Territory: United States
City: New Orleans
Period: 10/25/21 → …

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Networks and Communications
  • Hardware and Architecture
  • Media Technology

Keywords

  • deep neural networks
  • video analytics
