Bounds on the Sample Complexity of Bayesian Learning Using Information Theory and the VC Dimension

David Haussler, Michael Kearns, Robert E. Schapire

Research output: Contribution to journal › Article

67 Scopus citations

Abstract

In this paper we study a Bayesian or average-case model of concept learning with a twofold goal: to provide more precise characterizations of learning curve (sample complexity) behavior that depend on properties of both the prior distribution over concepts and the sequence of instances seen by the learner, and to smoothly unite in a common framework the popular statistical physics and VC dimension theories of learning curves. To achieve this, we undertake a systematic investigation and comparison of two fundamental quantities in learning and information theory: the probability of an incorrect prediction for an optimal learning algorithm, and the Shannon information gain. This study leads to a new understanding of the sample complexity of learning in several existing models.

Original language: English (US)
Pages (from-to): 83-113
Number of pages: 31
Journal: Machine Learning
Volume: 14
Issue number: 1
State: Published - Jan 1994

All Science Journal Classification (ASJC) codes

  • Software
  • Artificial Intelligence

Keywords

  • average-case learning
  • Bayesian learning
  • information theory
  • learning curves
  • statistical physics
  • VC dimension
