Utility-privacy tradeoffs in databases: An information-theoretic approach

Lalitha Sankar, S. Raj Rajagopalan, H. Vincent Poor

Research output: Contribution to journal › Article › peer-review

296 Scopus citations

Abstract

Ensuring the usefulness of electronic data sources while providing necessary privacy guarantees is an important unsolved problem. This problem drives the need for an analytical framework that can quantify the privacy of personally identifiable information while still providing a quantifiable benefit (utility) to multiple legitimate information consumers. This paper presents an information-theoretic framework that promises an analytical model guaranteeing tight bounds on how much utility is possible for a given level of privacy and vice versa. Specific contributions include: 1) stochastic data models for both categorical and numerical data; 2) utility-privacy tradeoff regions and the encoding (sanitization) schemes achieving them for both classes and their practical relevance; and 3) modeling of prior knowledge at the user and/or data source and optimal encoding schemes for both cases.

Original language: English (US)
Article number: 6482222
Pages (from-to): 838-852
Number of pages: 15
Journal: IEEE Transactions on Information Forensics and Security
Volume: 8
Issue number: 6
State: Published - 2013
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Safety, Risk, Reliability and Quality
  • Computer Networks and Communications

Keywords

  • Utility
  • databases
  • equivocation
  • privacy
  • rate-distortion theory
  • side information
