Communication-Constrained Distributed Learning: TSI-Aided Asynchronous Optimization with Stale Gradient

Siyuan Yu, Wei Chen, H. Vincent Poor

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Scopus citation

Abstract

Distributed machine learning, including federated learning, has attracted considerable attention due to its potential for scaling computational resources, reducing training time, and helping protect user privacy. As one of the key enablers of distributed learning, asynchronous optimization allows multiple workers to process data simultaneously without incurring synchronization delay. However, given limited communication bandwidth, asynchronous optimization can be hampered by gradient staleness, which severely hinders the learning process. In this paper, we present a communication-constrained distributed learning scheme in which asynchronous stochastic gradients generated by parallel workers are transmitted over a shared medium or link. Our aim is to minimize the average training time by striking the optimal tradeoff between the number of parallel workers and their gradient staleness. To this end, a queueing-theoretic model is formulated, which allows us to find the optimal number of workers participating in the asynchronous optimization. Furthermore, we leverage the packet arrival time at the parameter server, also referred to as Timing Side Information (TSI), to compress the staleness information for staleness-aware Asynchronous Stochastic Gradient Descent (Asyn-SGD). Numerical results demonstrate a substantial reduction in training time owing to both the worker selection and the TSI-aided compression of staleness information.

Original language: English (US)
Title of host publication: GLOBECOM 2023 - 2023 IEEE Global Communications Conference
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1495-1500
Number of pages: 6
ISBN (Electronic): 9798350310900
DOIs
State: Published - 2023
Externally published: Yes
Event: 2023 IEEE Global Communications Conference, GLOBECOM 2023 - Kuala Lumpur, Malaysia
Duration: Dec 4 2023 to Dec 8 2023

Publication series

Name: Proceedings - IEEE Global Communications Conference, GLOBECOM
ISSN (Print): 2334-0983
ISSN (Electronic): 2576-6813

Conference

Conference: 2023 IEEE Global Communications Conference, GLOBECOM 2023
Country/Territory: Malaysia
City: Kuala Lumpur
Period: 12/4/23 to 12/8/23

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Networks and Communications
  • Hardware and Architecture
  • Signal Processing

Keywords

  • Asynchronous optimization
  • federated learning
  • gradient staleness
  • stochastic gradient descent
  • timing side information
