Maximum entropy distribution estimation with generalized regularization

Miroslav Dudík, Robert E. Schapire

Research output: Chapter in Book/Report/Conference proceedingConference contribution

31 Scopus citations


We present a unified and complete account of maximum entropy distribution estimation subject to constraints represented by convex potential functions or, alternatively, by convex regularization. We provide fully general performance guarantees and an algorithm with a complete convergence proof. As special cases, we can easily derive performance guarantees for many known regularization types, including ℓ1, ℓ2, ℓ2 2 and ℓ1+ ℓ22 style regularization. Furthermore, our general approach enables us to use information about the structure of the feature space or about sample selection bias to derive entirely new regularization functions with superior guarantees. We propose an algorithm solving a large and general subclass of generalized maxent problems, including all discussed in the paper, and prove its convergence. Our approach generalizes techniques based on information geometry and Bregman divergences as well as those based more directly on compactness.

Original languageEnglish (US)
Title of host publicationLearning Theory - 19th Annual Conference on Learning Theory, COLT 2006, Proceedings
PublisherSpringer Verlag
Number of pages16
ISBN (Print)3540352945, 9783540352945
StatePublished - 2006
Externally publishedYes
Event19th Annual Conference on Learning Theory, COLT 2006 - Pittsburgh, PA, United States
Duration: Jun 22 2006Jun 25 2006

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume4005 LNAI
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349


Other19th Annual Conference on Learning Theory, COLT 2006
Country/TerritoryUnited States
CityPittsburgh, PA

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science


Dive into the research topics of 'Maximum entropy distribution estimation with generalized regularization'. Together they form a unique fingerprint.

Cite this