Abstract
Federated learning (FedL) has emerged as a popular technique for distributing model training over a set of wireless devices, via iterative local updates (at the devices) and global aggregations (at the server). In this paper, we develop *parallel successive learning* (PSL), which expands the FedL architecture along three dimensions: (i) *Network*, allowing decentralized cooperation among the devices via device-to-device (D2D) communications; (ii) *Heterogeneity*, interpreted at three levels: (ii-a) learning, where PSL considers heterogeneous numbers of stochastic gradient descent iterations with different mini-batch sizes across the devices, (ii-b) data, where PSL presumes a *dynamic environment* with data arrival and departure, in which the distributions of local datasets evolve over time, captured via a new metric for *model/concept drift*, and (ii-c) device, where PSL considers devices with different computation and communication capabilities; and (iii) *Proximity*, where devices have different distances to each other and to the access point. PSL considers the realistic scenario where global aggregations are conducted with *idle times* in between them to improve resource efficiency, and incorporates *data dispersion* and *model dispersion with local model condensation* into FedL. Our analysis sheds light on the notion of *cold* vs. *warmed-up* models and on model *inertia* in distributed machine learning. We then propose *network-aware dynamic model tracking* to optimize the tradeoff between model learning and resource efficiency, which we show is an NP-hard signomial programming problem. We finally solve this problem by proposing a general optimization solver. Our numerical results reveal new findings on the interdependencies among the idle times between global aggregations, model/concept drift, and the D2D cooperation configuration.
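To make the baseline concrete, the following is a minimal sketch of the iterative local-update/global-aggregation loop the abstract describes, with the learning-level heterogeneity PSL allows: each device runs its own number of SGD iterations with its own mini-batch size before the server aggregates. This is an illustrative FedAvg-style example on a toy least-squares problem, not the paper's PSL scheme (which further adds D2D cooperation, data/model dispersion, and dynamic model tracking); all names (`local_sgd`, `global_round`, the per-device `steps`/`batch` fields) are hypothetical.

```python
# Illustrative sketch only: FedAvg-style training with heterogeneous
# per-device SGD step counts and mini-batch sizes. Not the paper's PSL algorithm.
import numpy as np

def local_sgd(w, X_all, y_all, num_steps, batch_size, lr=0.1):
    """Run num_steps mini-batch SGD iterations on a local least-squares loss."""
    w = w.copy()
    for _ in range(num_steps):
        idx = np.random.choice(len(X_all), size=batch_size, replace=False)
        X, y = X_all[idx], y_all[idx]
        grad = X.T @ (X @ w - y) / batch_size  # gradient of (1/2m)*||Xw - y||^2
        w -= lr * grad
    return w

def global_round(w_global, devices):
    """One global aggregation: dataset-size-weighted average of local models."""
    updates, weights = [], []
    for dev in devices:  # each device has its own data, step count, batch size
        w_i = local_sgd(w_global, dev["X"], dev["y"], dev["steps"], dev["batch"])
        updates.append(w_i)
        weights.append(len(dev["X"]))
    weights = np.array(weights, dtype=float) / sum(weights)
    return sum(p * w_i for p, w_i in zip(weights, updates))

rng = np.random.default_rng(0)
d = 5
w_true = rng.normal(size=d)
devices = []
# Heterogeneous configurations: (SGD steps, mini-batch size, local dataset size).
for steps, batch, n in [(1, 8, 100), (5, 16, 200), (10, 4, 50)]:
    X = rng.normal(size=(n, d))
    devices.append({"X": X, "y": X @ w_true + 0.01 * rng.normal(size=n),
                    "steps": steps, "batch": batch})

w = np.zeros(d)
for t in range(20):  # global aggregations, local updates in between
    w = global_round(w, devices)
print("distance to target model:", np.linalg.norm(w - w_true))
```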
| Original language | English (US) |
| --- | --- |
| Pages (from-to) | 1-16 |
| Number of pages | 16 |
| Journal | IEEE/ACM Transactions on Networking |
| State | Accepted/In press - 2023 |
| Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Software
- Computer Science Applications
- Computer Networks and Communications
- Electrical and Electronic Engineering
Keywords
- Computational modeling
- Cooperative federated learning
- Data models
- Device-to-device communication
- Dispersion
- Distributed databases
- dynamic machine learning
- network optimization
- Training
- Wireless sensor networks