Mitigating Membership Inference Attacks by Self-Distillation Through a Novel Ensemble Architecture

Xinyu Tang, Saeed Mahloujifar, Liwei Song, Virat Shejwalkar, Milad Nasr, Amir Houmansadr, Prateek Mittal

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Membership inference attacks are a key measure to evaluate privacy leakage in machine learning (ML) models. It is important to train ML models that have high membership privacy while largely preserving their utility. In this work, we propose a new framework to train privacy-preserving models that induce similar behavior on member and non-member inputs to mitigate membership inference attacks. Our framework, called SELENA, has two major components. The first component and the core of our defense is a novel ensemble architecture for training. This architecture, which we call Split-AI, splits the training data into random subsets, and trains a model on each subset of the data. We use an adaptive inference strategy at test time: our ensemble architecture aggregates the outputs of only those models that did not contain the input sample in their training data. Our second component, Self-Distillation, (self-)distills the training dataset through our Split-AI ensemble, without using any external public datasets. We prove that our Split-AI architecture defends against a family of membership inference attacks, however, our defense does not provide provable guarantees against all possible attackers as opposed to differential privacy. This enables us to improve the utility of models compared to DP. Through extensive experiments on major benchmark datasets we show that SELENA presents a superior trade-off between (empirical) membership privacy and utility compared to the state of the art empirical privacy defenses. In particular, SELENA incurs no more than 3.9% drop in classification accuracy compared to the undefended model while reducing the membership inference attack advantage by a factor of up to 3.7 compared to MemGuard and a factor of up to 2.1 compared to adversarial regularization.

Original languageEnglish (US)
Title of host publicationProceedings of the 31st USENIX Security Symposium, Security 2022
PublisherUSENIX Association
Pages1433-1450
Number of pages18
ISBN (Electronic)9781939133311
StatePublished - 2022
Event31st USENIX Security Symposium, Security 2022 - Boston, United States
Duration: Aug 10 2022Aug 12 2022

Publication series

NameProceedings of the 31st USENIX Security Symposium, Security 2022

Conference

Conference31st USENIX Security Symposium, Security 2022
Country/TerritoryUnited States
CityBoston
Period8/10/228/12/22

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems
  • Safety, Risk, Reliability and Quality

Fingerprint

Dive into the research topics of 'Mitigating Membership Inference Attacks by Self-Distillation Through a Novel Ensemble Architecture'. Together they form a unique fingerprint.

Cite this