Improved Information-Theoretic Generalization Bounds for Distributed, Federated, and Iterative Learning †

Leighton Pate Barnes, Alex Dytso, Harold Vincent Poor

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

We consider information-theoretic bounds on the expected generalization error for statistical learning problems in a network setting. In this setting, there are K nodes, each with its own independent dataset, and the models from the K nodes have to be aggregated into a final centralized model. We consider both simple averaging of the models as well as more complicated multi-round algorithms. We give upper bounds on the expected generalization error for a variety of problems, such as those with Bregman divergence or Lipschitz continuous losses, that demonstrate an improved dependence of O(1/K) on the number of nodes. These “per node” bounds are in terms of the mutual information between the training dataset and the trained weights at each node and are therefore useful in describing the generalization properties inherent to having communication or privacy constraints at each node.
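The one-shot averaging setting described above can be illustrated with a minimal simulation. This is not the paper's algorithm or its bounds; it is a hypothetical sketch in which each node's "model" is simply the sample mean of its local Gaussian data, and the server averages the K local models. The shrinking error of the averaged model mirrors the improved per-node dependence that the abstract highlights.

```python
import random

# Illustrative sketch (not the paper's method): K nodes each hold an
# independent dataset of n i.i.d. samples from N(mu, sigma^2). Each node
# "trains" a local model (here, the sample mean), and the central server
# aggregates by simple averaging, as in the one-shot setting above.

random.seed(0)

def node_model(n, mu=1.0, sigma=2.0):
    """Local training: return the sample mean of n Gaussian draws."""
    data = [random.gauss(mu, sigma) for _ in range(n)]
    return sum(data) / n

def averaged_error(K, n, trials=2000, mu=1.0):
    """Mean squared error of the K-node averaged model over many trials."""
    errs = []
    for _ in range(trials):
        w = sum(node_model(n, mu) for _ in range(K)) / K  # server-side averaging
        errs.append((w - mu) ** 2)
    return sum(errs) / trials

# With more nodes, the averaged model's error drops (roughly like 1/K here).
e1 = averaged_error(K=1, n=10)
e8 = averaged_error(K=8, n=10)
print(e1 > e8)  # averaging over more nodes reduces the error
```

In this toy example the per-node datasets are independent, so averaging K unbiased local estimates divides the variance by K; the paper's information-theoretic bounds capture an analogous per-node improvement via the mutual information between each node's dataset and its trained weights.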

Original language: English (US)
Article number: 1178
Journal: Entropy
Volume: 24
Issue number: 9
State: Published - Sep 2022
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Electrical and Electronic Engineering
  • General Physics and Astronomy
  • Mathematical Physics
  • Physics and Astronomy (miscellaneous)

Keywords

  • distributed and federated learning
  • generalization error
  • information-theoretic bounds
