Optimal lossless data compression: Non-asymptotics and asymptotics

Ioannis Kontoyiannis, Sergio Verdu

Research output: Contribution to journal › Article › peer-review

103 Scopus citations

Abstract

This paper provides an extensive study of the behavior of the best achievable rate (and other related fundamental limits) in variable-length strictly lossless compression. In the non-asymptotic regime, the fundamental limits of fixed-to-variable lossless compression with and without prefix constraints are shown to be tightly coupled. Several precise, quantitative bounds are derived, connecting the distribution of the optimal code lengths to the source information spectrum, and an exact analysis of the best achievable rate for arbitrary sources is given. Fine asymptotic results are proved for arbitrary (not necessarily prefix) compressors on general mixing sources. Nonasymptotic, explicit Gaussian approximation bounds are established for the best achievable rate on Markov sources. The source dispersion and the source varentropy rate are defined and characterized. Together with the entropy rate, the varentropy rate serves to tightly approximate the fundamental nonasymptotic limits of fixed-to-variable compression for all but very small block lengths.
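As a rough illustration of the last sentence of the abstract, the sketch below computes the entropy and varentropy of a memoryless source and plugs them into a Gaussian approximation of the best achievable rate at block length n and excess-length probability eps. The specific form used here, H + sqrt(V/n) Q^{-1}(eps) - log2(n)/(2n), is assumed for memoryless sources and is not quoted from the abstract. The function names and the example distributions are hypothetical.

```python
import math
from statistics import NormalDist


def entropy_and_varentropy(probs):
    """Return (H, V) for a memoryless source with pmf `probs`.

    H is the entropy in bits; V is the varentropy in bits squared,
    i.e. the variance of the information random variable -log2 P(X).
    """
    ps = [p for p in probs if p > 0]
    info = [-math.log2(p) for p in ps]
    h = sum(p * i for p, i in zip(ps, info))
    v = sum(p * (i - h) ** 2 for p, i in zip(ps, info))
    return h, v


def gaussian_rate_approx(probs, n, eps):
    """Gaussian approximation to the best rate (bits per symbol).

    Assumed form (an illustration, not taken from the abstract):
        H + sqrt(V / n) * Qinv(eps) - log2(n) / (2 n)
    where Qinv is the inverse of the standard normal tail function.
    """
    h, v = entropy_and_varentropy(probs)
    q_inv = NormalDist().inv_cdf(1.0 - eps)  # Qinv(eps)
    return h + math.sqrt(v / n) * q_inv - math.log2(n) / (2 * n)


if __name__ == "__main__":
    source = [0.11, 0.89]  # hypothetical biased binary source
    for n in (100, 1000, 10000):
        print(n, round(gaussian_rate_approx(source, n, eps=0.1), 4))
```

For a uniform source the varentropy is zero, so in this sketch only the logarithmic correction separates the approximation from the entropy. For a biased source the sqrt(V/n) term dominates the gap at moderate block lengths.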

Original language: English (US)
Article number: 6665143
Pages (from-to): 777-795
Number of pages: 19
Journal: IEEE Transactions on Information Theory
Volume: 60
Issue number: 2
State: Published - Feb 2014

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Computer Science Applications
  • Library and Information Sciences

Keywords

  • Lossless data compression
  • Markov sources
  • central limit theorem
  • entropy
  • finite-block length fundamental limits
  • fixed-to-fixed source coding
  • fixed-to-variable source coding
  • minimal coding variance
  • source dispersion
  • varentropy
