Improved Information Theoretic Generalization Bounds for Distributed and Federated Learning

L. P. Barnes, A. Dytso, H. V. Poor

Research output: Chapter in Book/Report/Conference proceedingConference contribution

6 Scopus citations

Abstract

We consider information-theoretic bounds on expected generalization error for statistical learning problems in a networked setting. In this setting, there are K nodes, each with its own independent dataset, and the models from each node have to be aggregated into a final centralized model. We consider both simple averaging of the models as well as more complicated multi-round algorithms. We give upper bounds on the expected generalization error for a variety of problems, such as those with Bregman divergence or Lipschitz continuous losses, that demonstrate an improved dependence of 1/K on the number of nodes. These "per node"bounds are in terms of the mutual information between the training dataset and the trained weights at each node, and are therefore useful in describing the generalization properties inherent to having communication or privacy constraints at each node.

Original languageEnglish (US)
Title of host publication2022 IEEE International Symposium on Information Theory, ISIT 2022
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages1465-1470
Number of pages6
ISBN (Electronic)9781665421591
DOIs
StatePublished - 2022
Externally publishedYes
Event2022 IEEE International Symposium on Information Theory, ISIT 2022 - Espoo, Finland
Duration: Jun 26 2022Jul 1 2022

Publication series

NameIEEE International Symposium on Information Theory - Proceedings
Volume2022-June
ISSN (Print)2157-8095

Conference

Conference2022 IEEE International Symposium on Information Theory, ISIT 2022
Country/TerritoryFinland
CityEspoo
Period6/26/227/1/22

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Information Systems
  • Modeling and Simulation
  • Applied Mathematics

Fingerprint

Dive into the research topics of 'Improved Information Theoretic Generalization Bounds for Distributed and Federated Learning'. Together they form a unique fingerprint.

Cite this